Message ID | 20230328-soc-mailbox-v1-0-3953814532fd@marcan.st (mailing list archive) |
---|---|
Headers | show |
Series | mailbox: apple: Move driver into soc/apple and stop using the subsystem | expand |
This resolves the firmware crashes related to queue overflow that I hit. Thank you for fixing this, Hector! Series is Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> [I'd give a stronger tag but I'm way out of my depths here for review. ++ on the ideas though.] March 28, 2023 9:14 AM, "Hector Martin" <marcan@marcan.st> wrote: > Once upon a time, Apple machines had some mailbox hardware, and we had > to write a driver for it. And since it was a mailbox, it felt natural to > use the Linux mailbox subsystem. > > More than a year later, it has become evident that was not the right > decision. > > Linux driver class subsystems generally exist for a few purposes: > 1. To allow mixing and matching generic producers and consumers. > 2. To implement common code that is likely to be shared across drivers, > and do so correctly so correct code only has to be written once. > 3. To force drivers into a "correct" design, such that driver authors > avoid common pitfalls. > > The mailbox subsystem does not do any of the above for us: > 1. Mailbox hardware is not generic; "mailbox" is a vague idea, not a > standard for communication. Almost all mailbox consumers are tied to > one or a few producers. There is practically no mixing and matching > possible. We have one (1) consumer subsystem (RTKit) talking to one > (1) mailbox driver (apple-mailbox). We might have a second consumer > in the future (SEP), but there will still be no useful > combinatronics with other drivers. > 2. The mailbox subsystem implements a bunch of common code for queuing, > but we don't need that because our hardware has hardware queues. It > also implements a bunch of common code for supporting multiple > channels, but we don't need that because our hardware only has one > channel (it has "endpoints" but those are just tags that are part of > the message). On top of this, the mailbox subsystem makes design > decisions unsuitable for our use case. Its queuing implementation > has a fixed queue size and fails sends when full instead of pushing > back by blocking, which is completely unsuitable for high-traffic > mailboxes with hard reliability requirements, such as ours. We've > also run into multiple issues around using mailbox in an atomic > context (required for SMC reboot/shutdown requests). > 3. Mailbox doesn't really do much for us as far as driver design. > If anything, it has been forcing us to add extra workarounds for the > impedance mismatches between the subsystem core and the hardware. > Other drivers already are inconsistent in how they use the mailbox > core, because the documentation is unclear on various details. > > The interface for mailboxes is very simple - basically just a send() and > a receive callback. This series quite literally just rips out the > middleman, and connects both sides together directly. There just isn't > anything useful the mailbox common code is doing for us - it's just a > pile of complexity in the middle that adds bugs, impedance mismatches, > overhead, and offers no extra features we can use. > > This series offers: > > - A modest reduction in overall code size (-27 net lines excluding #1). > - Fixes a pile of bugs related to using the mailbox subsystem and its > quirks and limitations (race conditions when messages are already in > the queue on startup, atomic issues, the list goes on). > - Adds runtime-PM support. > - Adds support for the FIFOs in the mailbox hardware, improving > performance. > - Simplifies code by removing extraneous memory allocations (the > mailbox subsystem requires consumers to pass pointers to messages, > instead of inlining them, even though messages are usually only one or > two registers in size) and the requisite cleanup/freeing in the > completion path. > > In addition, it paves the way for future Apple-specific mailbox > optimizations, such as adding the ability to de-duplicate message sends > if the same message is already known to be in the FIFO (which can be > done by keeping a rolling history of recently sent messages). This is > useful for doorbell-style messages, which are redundant to send more > than once if not yet processed. > > Apple Silicon platforms use these mailboxes pervasively, including as > part of the GPU submission hot path. On top of that, bad interactions > with firmware coprocessors can cause immediate lockups or crashes with > no recovery possible but a reboot. Our requirements for reliability and > performance are probably much higher than the average mailbox user, and > we'd much rather not have a bunch of common code getting in the way of > performance profiling and future optimization. It doesn't make much > sense for the mailbox subsystem either, since solving these issues would > require major refactoring that is unlikely to provide significant > benefit to most other users. > > So let's just call usage of the mailbox subsystem a failed experiment, > and move the driver into soc/apple, where we can control the API and can > add whatever peculiarities we need for our mailboxes. Farewell, mailbox. > > There are no changes to the DT bindings. This driver has been shipping > in Asahi stable kernel packages for a week, with no regressions > reported by any users. > > As an additional non-kernel-related benefit, this introduces a direct > module dependency between consumers and the mailbox producer. This, in > turn, is in the critical path for the NVMe driver on these platforms. > Prior to this series, we had to have custom initramfs hooks to add > apple-mailbox to distro initramfses, and accidentally removing these > hooks would result in a completely unbootable system (there is no way > for standard initramfs machinery to detect soft consumer/producer > relationships like this, they usually just go looking for block device > modules and their direct dependencies). We still need the initramfs > hooks for other things, but with this change, completely removing all > Apple-related initramfs hooks will at least result in a *bootable* > system so you can fix the problem. This has already bit several users, > and it also means many more distros have a chance of working out of the > box (enough to let you install extra stuff) on these platforms, instead > of having a hard requirement that *every single distro* fix up their > initramfs generation in order to even boot/install on these platforms at > all. > > Jassi: Ideally I'd like to get an ack on this and merge it all through > asahi-soc, so we don't have to play things patch-by-patch across > multiple merge cycles to avoid potentially broken intermediate states. > > Signed-off-by: Hector Martin <marcan@marcan.st> > --- > Hector Martin (5): > soc: apple: rtkit: Get rid of apple_rtkit_send_message_wait > soc: apple: mailbox: Add ASC/M3 mailbox driver > soc: apple: rtkit: Port to the internal mailbox driver > mailbox: apple: Delete driver > soc: apple: mailbox: Rename config symbol to APPLE_MAILBOX > > MAINTAINERS | 2 - > drivers/mailbox/Kconfig | 12 - > drivers/mailbox/Makefile | 2 - > drivers/mailbox/apple-mailbox.c | 441 ------------------------------------- > drivers/soc/apple/Kconfig | 15 +- > drivers/soc/apple/Makefile | 3 + > drivers/soc/apple/mailbox.c | 434 ++++++++++++++++++++++++++++++++++++ > drivers/soc/apple/mailbox.h | 48 ++++ > drivers/soc/apple/rtkit-internal.h | 8 +- > drivers/soc/apple/rtkit.c | 133 +++-------- > include/linux/apple-mailbox.h | 19 -- > include/linux/soc/apple/rtkit.h | 18 -- > 12 files changed, 529 insertions(+), 606 deletions(-) > --- > base-commit: bdfe6de2695c5bccc663a5a7d530f81925d8bc10 > change-id: 20230328-soc-mailbox-3cb6bb2b0b2d > > Best regards, > -- > Hector Martin <marcan@marcan.st>
On Tue, Mar 28, 2023 at 8:14 AM Hector Martin <marcan@marcan.st> wrote: > > Once upon a time, Apple machines had some mailbox hardware, and we had > to write a driver for it. And since it was a mailbox, it felt natural to > use the Linux mailbox subsystem. > > More than a year later, it has become evident that was not the right > decision. > > Linux driver class subsystems generally exist for a few purposes: > 1. To allow mixing and matching generic producers and consumers. > 2. To implement common code that is likely to be shared across drivers, > and do so correctly so correct code only has to be written once. > 3. To force drivers into a "correct" design, such that driver authors > avoid common pitfalls. > The Mailbox subsystem is meant to serve (2) and possibly (3). We never aimed for (1), we can't... because the firmware on the remote end is also a part of "local hardware" -- keeping every bit of hardware the same, if just the f/w changes you may have to change the linux side driver. > The mailbox subsystem does not do any of the above for us: > 1. Mailbox hardware is not generic; "mailbox" is a vague idea, not a > standard for communication. > There _can not_ be a standard mailbox implementation without a specification -- which doesn't exist. > Almost all mailbox consumers are tied to > one or a few producers. There is practically no mixing and matching > possible. We have one (1) consumer subsystem (RTKit) talking to one > (1) mailbox driver (apple-mailbox). We might have a second consumer > in the future (SEP), but there will still be no useful > combinatronics with other drivers. > Sorry I don't follow what you expect. > 2. The mailbox subsystem implements a bunch of common code for queuing, > but we don't need that because our hardware has hardware queues. It > also implements a bunch of common code for supporting multiple > channels, but we don't need that because our hardware only has one > channel (it has "endpoints" but those are just tags that are part of > the message). > In note-1, you complain it is not standard/flexible enough, and here its support for features, that you don't need, become a problem? > On top of this, the mailbox subsystem makes design > decisions unsuitable for our use case. Its queuing implementation > has a fixed queue size and fails sends when full instead of pushing > back by blocking, which is completely unsuitable for high-traffic > mailboxes with hard reliability requirements, such as ours. We've > also run into multiple issues around using mailbox in an atomic > context (required for SMC reboot/shutdown requests). > I don't think you ever shared the issues you were struggling with. Not to mean there can be no limitation but I knew a platform that ran a virtual block-device and custom filesystem over the mailbox, and it was good for a camera product that needs high bandwidth image transfer. > 3. Mailbox doesn't really do much for us as far as driver design. > If anything, it has been forcing us to add extra workarounds for the > impedance mismatches between the subsystem core and the hardware. > Other drivers already are inconsistent in how they use the mailbox > core, because the documentation is unclear on various details. > Again, would have helped to know the issues and details you feel missing. > This series offers: > > - A modest reduction in overall code size (-27 net lines excluding #1). > - Fixes a pile of bugs related to using the mailbox subsystem and its > quirks and limitations (race conditions when messages are already in > the queue on startup, atomic issues, the list goes on). > - Adds runtime-PM support. > - Adds support for the FIFOs in the mailbox hardware, improving > performance. > - Simplifies code by removing extraneous memory allocations (the > mailbox subsystem requires consumers to pass pointers to messages, > instead of inlining them, even though messages are usually only one or > two registers in size) and the requisite cleanup/freeing in the > completion path. > > In addition, it paves the way for future Apple-specific mailbox > optimizations, such as adding the ability to de-duplicate message sends > if the same message is already known to be in the FIFO (which can be > done by keeping a rolling history of recently sent messages). This is > useful for doorbell-style messages, which are redundant to send more > than once if not yet processed. > > Apple Silicon platforms use these mailboxes pervasively, including as > part of the GPU submission hot path. On top of that, bad interactions > with firmware coprocessors can cause immediate lockups or crashes with > no recovery possible but a reboot. Our requirements for reliability and > performance are probably much higher than the average mailbox user, and > we'd much rather not have a bunch of common code getting in the way of > performance profiling and future optimization. It doesn't make much > sense for the mailbox subsystem either, since solving these issues would > require major refactoring that is unlikely to provide significant > benefit to most other users. > > So let's just call usage of the mailbox subsystem a failed experiment, > and move the driver into soc/apple, where we can control the API and can > add whatever peculiarities we need for our mailboxes. Farewell, mailbox. > > There are no changes to the DT bindings. This driver has been shipping > in Asahi stable kernel packages for a week, with no regressions > reported by any users. > > As an additional non-kernel-related benefit, this introduces a direct > module dependency between consumers and the mailbox producer. This, in > turn, is in the critical path for the NVMe driver on these platforms. > Prior to this series, we had to have custom initramfs hooks to add > apple-mailbox to distro initramfses, and accidentally removing these > hooks would result in a completely unbootable system (there is no way > for standard initramfs machinery to detect soft consumer/producer > relationships like this, they usually just go looking for block device > modules and their direct dependencies). We still need the initramfs > hooks for other things, but with this change, completely removing all > Apple-related initramfs hooks will at least result in a *bootable* > system so you can fix the problem. This has already bit several users, > and it also means many more distros have a chance of working out of the > box (enough to let you install extra stuff) on these platforms, instead > of having a hard requirement that *every single distro* fix up their > initramfs generation in order to even boot/install on these platforms at > all. > > Jassi: Ideally I'd like to get an ack on this and merge it all through > asahi-soc, so we don't have to play things patch-by-patch across > multiple merge cycles to avoid potentially broken intermediate states. > Instead of every platform implementing their own message queuing and handling mechanism and parsing producer-comsumer links from the DT, there is a reusable code in drivers/mailbox. Which does seem inadequate if one comes looking for a "standard/generic" mailbox implementation (again, which can not exist). And there is a reason the reduction in code with this patchset is meager -- you anyway have to implement your platform specific stuff. But if redoing mailbox overall saves you complexity, I am ok with it. cheers.
On 30/03/2023 01.04, Jassi Brar wrote: > On Tue, Mar 28, 2023 at 8:14 AM Hector Martin <marcan@marcan.st> wrote: >> >> Once upon a time, Apple machines had some mailbox hardware, and we had >> to write a driver for it. And since it was a mailbox, it felt natural to >> use the Linux mailbox subsystem. >> >> More than a year later, it has become evident that was not the right >> decision. >> >> Linux driver class subsystems generally exist for a few purposes: >> 1. To allow mixing and matching generic producers and consumers. >> 2. To implement common code that is likely to be shared across drivers, >> and do so correctly so correct code only has to be written once. >> 3. To force drivers into a "correct" design, such that driver authors >> avoid common pitfalls. >> > The Mailbox subsystem is meant to serve (2) and possibly (3). > We never aimed for (1), we can't... because the firmware on the remote > end is also a part of "local hardware" -- keeping every bit of > hardware the same, if just the f/w changes you may have to change the > linux side driver. Right, I'm just explaining why this isn't like i2c/spi and similar subsystems. The value proposition is significantly weaker with mailbox for this reason. >> 2. The mailbox subsystem implements a bunch of common code for queuing, >> but we don't need that because our hardware has hardware queues. It >> also implements a bunch of common code for supporting multiple >> channels, but we don't need that because our hardware only has one >> channel (it has "endpoints" but those are just tags that are part of >> the message). >> > In note-1, you complain it is not standard/flexible enough, and here > its support for features, that you don't need, become a problem? It has features we don't need that introduce bugs and complexity, while missing features we *do* need. So yes, it's doubly a problem. >> On top of this, the mailbox subsystem makes design >> decisions unsuitable for our use case. Its queuing implementation >> has a fixed queue size and fails sends when full instead of pushing >> back by blocking, which is completely unsuitable for high-traffic >> mailboxes with hard reliability requirements, such as ours. We've >> also run into multiple issues around using mailbox in an atomic >> context (required for SMC reboot/shutdown requests). >> > I don't think you ever shared the issues you were struggling with. I did try to send a patch clarifying/cleaning up inconsistent usage of the atomic codepath in other drivers, and you rejected it. At that point I gave up in trying to contribute to cleaning up the existing mess, because you're clearly not interested. I have some downstream fixes for the atomic codepath and I don't think they even fix all the problems, because we still run into issues. The design of the mailbox core is just bizarre and makes it very hard to reason about the corner cases through the whole queuing business. It's a big chunk of complexity and bug-prone design and it just doesn't make any sense for us to spend time on this if cleaning things up is going to be an uphill battle *and* in the end the subsystem still does nothing useful for us. It's just using the subsystem for the sake of using it at that point. It makes no sense. As for the other issue, there's a giant comment around the queue length define: > /* > * The length of circular buffer for queuing messages from a client. > * 'msg_count' tracks the number of buffered messages while 'msg_free' > * is the index where the next message would be buffered. > * We shouldn't need it too big because every transfer is interrupt > * triggered and if we have lots of data to transfer, the interrupt > * latencies are going to be the bottleneck, not the buffer length. > * Besides, mbox_send_message could be called from atomic context and > * the client could also queue another message from the notifier 'tx_done' > * of the last transfer done. > * REVISIT: If too many platforms see the "Try increasing MBOX_TX_QUEUE_LEN" > * print, it needs to be taken from config option or somesuch. > */ > #define MBOX_TX_QUEUE_LEN 20 This issue is clearly known, and it doesn't take a lot of thinking to realize that *any* queue length limit coupled with hard-fails on message sends instead of pushback is just unsuitable for many use cases. Maybe all existing mailbox users have intrinsically synchronous use cases that keep the queue idle enough, or maybe they're just broken only in corner cases that haven't come back to the mailbox subsystem yet. Either way, as far as I'm concerned this is broken by design in the general case. > Not to mean there can be no limitation but I knew a platform that ran > a virtual block-device and custom filesystem over the mailbox, and it > was good for a camera product that needs high bandwidth image > transfer. I guess it worked for them, but it's been breaking our GPU use case. So if it's working for other people, I guess that's fine. But it's definitely not working for us, and I'm not inclined to try to fix the issues if nobody else has a problem. Working on common code has the inherent risk that any changes can break other users and I have no way to test every other mailbox user. There are real *costs* associated with having common code, especially for boutique things like mailbox which are highly heterogeneous and used on all kinds of weird systems no single developer can hope to have access to. So if you say it's working well for other people, all the more reason for us to move away and leave them alone instead of trying to make major changes to fix everything that isn't working for us and risk breaking others. >> Jassi: Ideally I'd like to get an ack on this and merge it all through >> asahi-soc, so we don't have to play things patch-by-patch across >> multiple merge cycles to avoid potentially broken intermediate states. >> > Instead of every platform implementing their own message queuing and > handling mechanism and parsing producer-comsumer links from the DT, > there is a reusable code in drivers/mailbox. As I said, we don't need nor want message queuing (yes, this is the bulk of the complexity of the subsystem and really the source of a lot of the pain, especially once you start throwing atomic into the mix). Other mailbox hardware might, and that's fine. It's just not for us. Especially not if it falls over when the queue gets full like it does. I'm not saying the mailbox subsystem shouldn't exist, I'm saying it might be a good idea as it stands for *some* mailbox producers/consumers/use cases, but not *all*, and in this case, not ours. As for the DT links, that's ~25 lines of code. > And there is a reason the reduction in code with this patchset is > meager -- you anyway have to implement your platform specific stuff. Think about it for a second. We are ceasing use of the subsystem, which under any reasonable expectation would require us to *add* code to duplicate functionality. Indeed there's one new small function to deal with the OF phandle lookup. And there's also a whole new header file with prototypes that are a facsimile of the core mailbox API too. But overall the total LOC is down, because we more than make that up by removing overhead caused by having to bend to the wills of the subsystem and work around its issues (and we'd be removing even more if the upstream driver already had the new features like runtime-PM and workarounds for more issues). > But if redoing mailbox overall saves you complexity, I am ok with it. Is that an ack? :-) - Hector
On Wed, Mar 29, 2023 at 5:53 PM Hector Martin <marcan@marcan.st> wrote: > On 30/03/2023 01.04, Jassi Brar wrote: > >> On top of this, the mailbox subsystem makes design > >> decisions unsuitable for our use case. Its queuing implementation > >> has a fixed queue size and fails sends when full instead of pushing > >> back by blocking, which is completely unsuitable for high-traffic > >> mailboxes with hard reliability requirements, such as ours. We've > >> also run into multiple issues around using mailbox in an atomic > >> context (required for SMC reboot/shutdown requests). > >> > > I don't think you ever shared the issues you were struggling with. > > I did try to send a patch clarifying/cleaning up inconsistent usage of > the atomic codepath in other drivers, and you rejected it. At that point > I gave up in trying to contribute to cleaning up the existing mess, > because you're clearly not interested. > You mean https://lore.kernel.org/lkml/20220502090225.26478-6-marcan@marcan.st/ Now I see where this code-rage comes from. But let me clarify even more... You do not kill some api just because you don't need that and/or you think that is "stupid" because you can't see past your own use-case. Peek'ing is a valid usecase. If you need Polling, that does not mean other platforms are broken to need to Peek. You can not declare all platforms must be able to fetch data from remote in atomic context. Think of a platform that must do some sleepy calls to fetch data ? Or a large buffer copy is involved. If your platform can read and pass-on data in atomic context, you can still implement that around the existing peek api (ok a few extra loc involved). Even if we see that the api must provide polling, we may add a new callback or 'overload' peek.... but still you can not kill peek. If you do, I am sure another revolutionary will arise in few months finding the atomic-read requirement "stupid" and either revert poll or reintroduce peek :) > > As for the other issue, there's a giant comment around the queue length > define: > > > /* > > * The length of circular buffer for queuing messages from a client. > > * 'msg_count' tracks the number of buffered messages while 'msg_free' > > * is the index where the next message would be buffered. > > * We shouldn't need it too big because every transfer is interrupt > > * triggered and if we have lots of data to transfer, the interrupt > > * latencies are going to be the bottleneck, not the buffer length. > > * Besides, mbox_send_message could be called from atomic context and > > * the client could also queue another message from the notifier 'tx_done' > > * of the last transfer done. > > * REVISIT: If too many platforms see the "Try increasing MBOX_TX_QUEUE_LEN" > > * print, it needs to be taken from config option or somesuch. > > */ > > #define MBOX_TX_QUEUE_LEN 20 > > This issue is clearly known, and it doesn't take a lot of thinking to > realize that *any* queue length limit coupled with hard-fails on message > sends instead of pushback is just unsuitable for many use cases. Maybe > all existing mailbox users have intrinsically synchronous use cases that > keep the queue idle enough, or maybe they're just broken only in corner > cases that haven't come back to the mailbox subsystem yet. Either way, > as far as I'm concerned this is broken by design in the general case. > You may be surprised but I do understand hardcoding limits on buffer size is taboo.... unless benefits outweigh fingerpointing :) 1) Using an array here greatly simplifies the code by avoiding code to dynamically shrink/expand the linked list while constrained by the atomic context. Using array with head and tail indices is a fast and simplest way to implement a ring buffer. I also intented the api to have least footprint on the throughput and resources (my own initial usecase was high speed 4k image transfer). 2) The api relies on last_tx_done() to make sure we submit data only when we have an all-clear ... which is a platform specific way to ensure signal will physically reach the remote (whether data is accepted or not is upto the upper layer protocol and that's why it is recommended to pass pointer to data, rather than data as the signal). The way api is recommended (not required) to be used, the limit on TX_QUEUE_LEN(20), is not impactful beyond the fifo size of the controller. Though I am open to idea of seeing if tx failure should be considered a possiblity even after last_tx_done. Iirc on lkml, people have reported using 1000s tx calls per second within this queue limit. I don't know how you tried to interpret that limit but would have helped to know your issue. > > > But if redoing mailbox overall saves you complexity, I am ok with it. > > Is that an ack? :-) > You sound like being trapped in a bad marriage with mailbox :) And I really don't want you to stay in a rewardless situation --- I have actually asked some platforms during RFCs if mailbox is really useful for them (usually SMC/HVC based useage), but they found use. Please make sure it is not just code-rage of your patchset being rejected, and indeed there are things you can't do with the api. Because the api can not have Zero impact on any platform's implementation and my ack here could be used as a precedent for every platform to implement their own tx/rx queues and dt handling and move into drivers/soc/. A couple years later someone will see it doesn't make sense every platform is doing the same thing in driver/soc/ and maybe it s a good idea to have some drivers/mailbox/ to hold the common code. I am also aware I am just a volunteer at mailbox and can not dictate what you do with your platform. So, is there anything like Neither-acked-nor-objected-but-left-to-soc-by ? ;) cheers.
On 31/03/2023 01.35, Jassi Brar wrote: > On Wed, Mar 29, 2023 at 5:53 PM Hector Martin <marcan@marcan.st> wrote: >> On 30/03/2023 01.04, Jassi Brar wrote: > >>>> On top of this, the mailbox subsystem makes design >>>> decisions unsuitable for our use case. Its queuing implementation >>>> has a fixed queue size and fails sends when full instead of pushing >>>> back by blocking, which is completely unsuitable for high-traffic >>>> mailboxes with hard reliability requirements, such as ours. We've >>>> also run into multiple issues around using mailbox in an atomic >>>> context (required for SMC reboot/shutdown requests). >>>> >>> I don't think you ever shared the issues you were struggling with. >> >> I did try to send a patch clarifying/cleaning up inconsistent usage of >> the atomic codepath in other drivers, and you rejected it. At that point >> I gave up in trying to contribute to cleaning up the existing mess, >> because you're clearly not interested. >> > You mean https://lore.kernel.org/lkml/20220502090225.26478-6-marcan@marcan.st/ > Now I see where this code-rage comes from. > > But let me clarify even more... > You do not kill some api just because you don't need that and/or you > think that is "stupid" because you can't see past your own use-case. It is general Linux kernel policy not to have internal APIs with zero users. The Rust folks get pushback all the time for upstreaming stuff piecewise even though in that case there are known, upcoming, in-the-pipeline users (we do that too with rtkit but we always have upcoming users downstream and we're small enough nobody notices and complains :P). Having dead APIs that nobody uses and nobody can point at an upcoming use case for is technical debt. That's why my first patch in this series cleans up one of those on our side. >> This issue is clearly known, and it doesn't take a lot of thinking to >> realize that *any* queue length limit coupled with hard-fails on message >> sends instead of pushback is just unsuitable for many use cases. Maybe >> all existing mailbox users have intrinsically synchronous use cases that >> keep the queue idle enough, or maybe they're just broken only in corner >> cases that haven't come back to the mailbox subsystem yet. Either way, >> as far as I'm concerned this is broken by design in the general case. >> > You may be surprised but I do understand hardcoding limits on buffer > size is taboo.... unless benefits outweigh fingerpointing :) Using a fixed size buffer is not the problem, having no blocking mechanism when it gets full is the problem. > 2) The api relies on last_tx_done() to make sure we submit data only > when we have an all-clear ... That's not the issue, the issue is putting stuff *into* the queue, not taking it *out* of the queue and sending it to the hardware. > which is a platform specific way to > ensure signal will physically reach the remote (whether data is > accepted or not is upto the upper layer protocol and that's why it is > recommended to pass pointer to data, rather than data as the signal). > The way api is recommended (not required) to be used, the limit on > TX_QUEUE_LEN(20), is not impactful beyond the fifo size of the > controller. Though I am open to idea of seeing if tx failure should be > considered a possiblity even after last_tx_done. If I do this: for (int i = 0; i < 30; i++) { mbox_send_message(...); } Then, unless the remote is fast enough to accept messages faster than the CPU can send them, some of those sends will fail and refuse to send data, once the subsystem side queue is full. That is broken because it either loses data or forces the user to implement retry poll loops, neither of which is appropriate. The mailbox subsystem knows when the hardware can send data, it can properly block the send on that signal (which is exactly what my refactor in this series does when the hardware queue gets full). If we instead wait for the tx completion stuff before sending, then that defeats the point of having a queue because you'd be waiting for each prior message before sending a new one. And then you need to keep track of the last completion. And it requires a global serialization on the client side anyway unless you can guarantee you have less than QUEUE_SIZE clients. And you still have the issue of having to keep the message data around until that completion fires, which is more code and allocator overhead over just passing it inline, since it's a tiny amount of data. Etc etc etc. It is a bad API, using it properly and reliably requires basically re-implementing part of the subsystem logic in the consumer to work around the issues. > Iirc on lkml, people have reported using 1000s tx calls per second > within this queue limit. I don't know how you tried to interpret that > limit but would have helped to know your issue. For reference: Our GPU benchmarks will easily hit 18000+ TX calls per second through mailbox, even more for some corner cases (this is why I want to implement coalescing when the HW queue already has identical doorbells, to reduce that). More importantly, although the GPU side firmware is usually fast at processing this (it has an IRQ handler and its own doorbell coalescing), when GPU faults or errors happen it has latency spikes, and then we *must* block mailbox sends until it is ready to handle messages again. Dropping messages on the floor is not an option. This *has* to be 100% reliable or users' machines crash. >> >>> But if redoing mailbox overall saves you complexity, I am ok with it. >> >> Is that an ack? :-) >> > You sound like being trapped in a bad marriage with mailbox :) And > I really don't want you to stay in a rewardless situation --- I have > actually asked some platforms during RFCs if mailbox is really useful > for them (usually SMC/HVC based useage), but they found use. > Please make sure it is not just code-rage of your patchset being > rejected, and indeed there are things you can't do with the api. It isn't. There's no code rage here, that patch was a long time ago. What that patch told me was that cleaning up mailbox to work for us was going to be an uphill battle, and then over the course of the year+ after that it has become very evident that there is a lot of work to do to make mailbox work for us. Hence, the conclusion that we're better off without. Honestly, at this point, even without that rejection I'd still want to move away because there's just so much work to do to get all the features we need and bugs we're hitting fixed and no realistic way to test other consumers/drivers to make sure we don't break them in the process. > Because the api can not have Zero impact on any platform's > implementation and my ack here could be used as a precedent for every > platform to implement their own tx/rx queues and dt handling and move > into drivers/soc/. As I said, there's a very clear sign here that this is the right move: the overall code size goes down. After this series we have: - Less code in total (much less actually executing) - That works better - And is easier to understand and debug - And requires less maintenance effort to improve If other platforms come to the same conclusion for their use case then yes, they should move away from mailbox as well. I would expect that might be the case for a subset, not all, of users. If more users follow, that should be a sign to you that the mailbox subsystem isn't as useful as you'd like :) Put another way: common code should actually save you lines of code. If it's causing you to spend more lines of code to use it properly than it saves, it is not useful and does not actually improve the situation. > A couple years later someone will see it doesn't> make sense every> platform is doing the same thing in driver/soc/ and > maybe it s a good idea to have some drivers/mailbox/ to hold the > common code. If they are really doing the same thing, sure. And then might be a good time to re-think mailbox and what it should do and how it should offer this common code to drivers, in a way that works for more users and actually saves everyone time and maintenance effort, with less burden. > I am also aware I am just a volunteer at mailbox and can not dictate > what you do with your platform. So, is there anything like > Neither-acked-nor-objected-but-left-to-soc-by ? ;) Not really, because it's your subsystem so we do actually need you to ack the driver deletion patch if it's going to go through our tree. That's the rules. "Acked" doesn't mean "I am happy with this", it means "I am okay with this" ;) - Hector
On Tue, Mar 28, 2023 at 9:14 AM Hector Martin <marcan@marcan.st> wrote: > > Once upon a time, Apple machines had some mailbox hardware, and we had > to write a driver for it. And since it was a mailbox, it felt natural to > use the Linux mailbox subsystem. > > More than a year later, it has become evident that was not the right > decision. > > Linux driver class subsystems generally exist for a few purposes: > 1. To allow mixing and matching generic producers and consumers. > 2. To implement common code that is likely to be shared across drivers, > and do so correctly so correct code only has to be written once. > 3. To force drivers into a "correct" design, such that driver authors > avoid common pitfalls. > > The mailbox subsystem does not do any of the above for us: > 1. Mailbox hardware is not generic; "mailbox" is a vague idea, not a > standard for communication. Almost all mailbox consumers are tied to > one or a few producers. There is practically no mixing and matching > possible. We have one (1) consumer subsystem (RTKit) talking to one > (1) mailbox driver (apple-mailbox). We might have a second consumer > in the future (SEP), but there will still be no useful > combinatronics with other drivers. > 2. The mailbox subsystem implements a bunch of common code for queuing, > but we don't need that because our hardware has hardware queues. It > also implements a bunch of common code for supporting multiple > channels, but we don't need that because our hardware only has one > channel (it has "endpoints" but those are just tags that are part of > the message). On top of this, the mailbox subsystem makes design > decisions unsuitable for our use case. Its queuing implementation > has a fixed queue size and fails sends when full instead of pushing > back by blocking, which is completely unsuitable for high-traffic > mailboxes with hard reliability requirements, such as ours. We've > also run into multiple issues around using mailbox in an atomic > context (required for SMC reboot/shutdown requests). > 3. Mailbox doesn't really do much for us as far as driver design. > If anything, it has been forcing us to add extra workarounds for the > impedance mismatches between the subsystem core and the hardware. > Other drivers already are inconsistent in how they use the mailbox > core, because the documentation is unclear on various details. > > The interface for mailboxes is very simple - basically just a send() and > a receive callback. This series quite literally just rips out the > middleman, and connects both sides together directly. There just isn't > anything useful the mailbox common code is doing for us - it's just a > pile of complexity in the middle that adds bugs, impedance mismatches, > overhead, and offers no extra features we can use. > > This series offers: > > - A modest reduction in overall code size (-27 net lines excluding #1). > - Fixes a pile of bugs related to using the mailbox subsystem and its > quirks and limitations (race conditions when messages are already in > the queue on startup, atomic issues, the list goes on). > - Adds runtime-PM support. > - Adds support for the FIFOs in the mailbox hardware, improving > performance. > - Simplifies code by removing extraneous memory allocations (the > mailbox subsystem requires consumers to pass pointers to messages, > instead of inlining them, even though messages are usually only one or > two registers in size) and the requisite cleanup/freeing in the > completion path. > > In addition, it paves the way for future Apple-specific mailbox > optimizations, such as adding the ability to de-duplicate message sends > if the same message is already known to be in the FIFO (which can be > done by keeping a rolling history of recently sent messages). This is > useful for doorbell-style messages, which are redundant to send more > than once if not yet processed. > > Apple Silicon platforms use these mailboxes pervasively, including as > part of the GPU submission hot path. On top of that, bad interactions > with firmware coprocessors can cause immediate lockups or crashes with > no recovery possible but a reboot. Our requirements for reliability and > performance are probably much higher than the average mailbox user, and > we'd much rather not have a bunch of common code getting in the way of > performance profiling and future optimization. It doesn't make much > sense for the mailbox subsystem either, since solving these issues would > require major refactoring that is unlikely to provide significant > benefit to most other users. > > So let's just call usage of the mailbox subsystem a failed experiment, > and move the driver into soc/apple, where we can control the API and can > add whatever peculiarities we need for our mailboxes. Farewell, mailbox. > > There are no changes to the DT bindings. This driver has been shipping > in Asahi stable kernel packages for a week, with no regressions > reported by any users. > > As an additional non-kernel-related benefit, this introduces a direct > module dependency between consumers and the mailbox producer. This, in > turn, is in the critical path for the NVMe driver on these platforms. > Prior to this series, we had to have custom initramfs hooks to add > apple-mailbox to distro initramfses, and accidentally removing these > hooks would result in a completely unbootable system (there is no way > for standard initramfs machinery to detect soft consumer/producer > relationships like this, they usually just go looking for block device > modules and their direct dependencies). We still need the initramfs > hooks for other things, but with this change, completely removing all > Apple-related initramfs hooks will at least result in a *bootable* > system so you can fix the problem. This has already bit several users, > and it also means many more distros have a chance of working out of the > box (enough to let you install extra stuff) on these platforms, instead > of having a hard requirement that *every single distro* fix up their > initramfs generation in order to even boot/install on these platforms at > all. > > Jassi: Ideally I'd like to get an ack on this and merge it all through > asahi-soc, so we don't have to play things patch-by-patch across > multiple merge cycles to avoid potentially broken intermediate states. > > Signed-off-by: Hector Martin <marcan@marcan.st> > --- > Hector Martin (5): > soc: apple: rtkit: Get rid of apple_rtkit_send_message_wait > soc: apple: mailbox: Add ASC/M3 mailbox driver > soc: apple: rtkit: Port to the internal mailbox driver > mailbox: apple: Delete driver > soc: apple: mailbox: Rename config symbol to APPLE_MAILBOX > > MAINTAINERS | 2 - > drivers/mailbox/Kconfig | 12 - > drivers/mailbox/Makefile | 2 - > drivers/mailbox/apple-mailbox.c | 441 ------------------------------------- > drivers/soc/apple/Kconfig | 15 +- > drivers/soc/apple/Makefile | 3 + > drivers/soc/apple/mailbox.c | 434 ++++++++++++++++++++++++++++++++++++ > drivers/soc/apple/mailbox.h | 48 ++++ > drivers/soc/apple/rtkit-internal.h | 8 +- > drivers/soc/apple/rtkit.c | 133 +++-------- > include/linux/apple-mailbox.h | 19 -- > include/linux/soc/apple/rtkit.h | 18 -- > 12 files changed, 529 insertions(+), 606 deletions(-) > --- > base-commit: bdfe6de2695c5bccc663a5a7d530f81925d8bc10 > change-id: 20230328-soc-mailbox-3cb6bb2b0b2d > > Best regards, > -- > Hector Martin <marcan@marcan.st> > > Series LGTM. Acked-by: Neal Gompa <neal@gompa.dev> -- 真実はいつも一つ!/ Always, there's only one truth!
On 31/03/2023 13.14, Hector Martin wrote: > On 31/03/2023 01.35, Jassi Brar wrote: >> On Wed, Mar 29, 2023 at 5:53 PM Hector Martin <marcan@marcan.st> wrote: >>> On 30/03/2023 01.04, Jassi Brar wrote: >> >>>>> On top of this, the mailbox subsystem makes design >>>>> decisions unsuitable for our use case. Its queuing implementation >>>>> has a fixed queue size and fails sends when full instead of pushing >>>>> back by blocking, which is completely unsuitable for high-traffic >>>>> mailboxes with hard reliability requirements, such as ours. We've >>>>> also run into multiple issues around using mailbox in an atomic >>>>> context (required for SMC reboot/shutdown requests). >>>>> >>>> I don't think you ever shared the issues you were struggling with. >>> >>> I did try to send a patch clarifying/cleaning up inconsistent usage of >>> the atomic codepath in other drivers, and you rejected it. At that point >>> I gave up in trying to contribute to cleaning up the existing mess, >>> because you're clearly not interested. >>> >> You mean https://lore.kernel.org/lkml/20220502090225.26478-6-marcan@marcan.st/ >> Now I see where this code-rage comes from. >> >> But let me clarify even more... >> You do not kill some api just because you don't need that and/or you >> think that is "stupid" because you can't see past your own use-case. > > It is general Linux kernel policy not to have internal APIs with zero > users. The Rust folks get pushback all the time for upstreaming stuff > piecewise even though in that case there are known, upcoming, > in-the-pipeline users (we do that too with rtkit but we always have > upcoming users downstream and we're small enough nobody notices and > complains :P). Having dead APIs that nobody uses and nobody can point at > an upcoming use case for is technical debt. That's why my first patch in > this series cleans up one of those on our side. > >>> This issue is clearly known, and it doesn't take a lot of thinking to >>> realize that *any* queue length limit coupled with hard-fails on message >>> sends instead of pushback is just unsuitable for many use cases. Maybe >>> all existing mailbox users have intrinsically synchronous use cases that >>> keep the queue idle enough, or maybe they're just broken only in corner >>> cases that haven't come back to the mailbox subsystem yet. Either way, >>> as far as I'm concerned this is broken by design in the general case. >>> >> You may be surprised but I do understand hardcoding limits on buffer >> size is taboo.... unless benefits outweigh fingerpointing :) > > Using a fixed size buffer is not the problem, having no blocking > mechanism when it gets full is the problem. > >> 2) The api relies on last_tx_done() to make sure we submit data only >> when we have an all-clear ... > > That's not the issue, the issue is putting stuff *into* the queue, not > taking it *out* of the queue and sending it to the hardware. > >> which is a platform specific way to >> ensure signal will physically reach the remote (whether data is >> accepted or not is upto the upper layer protocol and that's why it is >> recommended to pass pointer to data, rather than data as the signal). >> The way api is recommended (not required) to be used, the limit on >> TX_QUEUE_LEN(20), is not impactful beyond the fifo size of the >> controller. Though I am open to idea of seeing if tx failure should be >> considered a possiblity even after last_tx_done. > > If I do this: > > for (int i = 0; i < 30; i++) { > mbox_send_message(...); > } > > Then, unless the remote is fast enough to accept messages faster than > the CPU can send them, some of those sends will fail and refuse to send > data, once the subsystem side queue is full. > > That is broken because it either loses data or forces the user to > implement retry poll loops, neither of which is appropriate. The mailbox > subsystem knows when the hardware can send data, it can properly block > the send on that signal (which is exactly what my refactor in this > series does when the hardware queue gets full). > > If we instead wait for the tx completion stuff before sending, then that > defeats the point of having a queue because you'd be waiting for each > prior message before sending a new one. And then you need to keep track > of the last completion. And it requires a global serialization on the > client side anyway unless you can guarantee you have less than > QUEUE_SIZE clients. And you still have the issue of having to keep the > message data around until that completion fires, which is more code and > allocator overhead over just passing it inline, since it's a tiny amount > of data. Etc etc etc. > > It is a bad API, using it properly and reliably requires basically > re-implementing part of the subsystem logic in the consumer to work > around the issues. > >> Iirc on lkml, people have reported using 1000s tx calls per second >> within this queue limit. I don't know how you tried to interpret that >> limit but would have helped to know your issue. > > For reference: Our GPU benchmarks will easily hit 18000+ TX calls per > second through mailbox, even more for some corner cases (this is why I > want to implement coalescing when the HW queue already has identical > doorbells, to reduce that). More importantly, although the GPU side > firmware is usually fast at processing this (it has an IRQ handler and > its own doorbell coalescing), when GPU faults or errors happen it has > latency spikes, and then we *must* block mailbox sends until it is ready > to handle messages again. Dropping messages on the floor is not an > option. This *has* to be 100% reliable or users' machines crash. > >>> >>>> But if redoing mailbox overall saves you complexity, I am ok with it. >>> >>> Is that an ack? :-) >>> >> You sound like being trapped in a bad marriage with mailbox :) And >> I really don't want you to stay in a rewardless situation --- I have >> actually asked some platforms during RFCs if mailbox is really useful >> for them (usually SMC/HVC based useage), but they found use. > >> Please make sure it is not just code-rage of your patchset being >> rejected, and indeed there are things you can't do with the api. > > It isn't. There's no code rage here, that patch was a long time ago. > What that patch told me was that cleaning up mailbox to work for us was > going to be an uphill battle, and then over the course of the year+ > after that it has become very evident that there is a lot of work to do > to make mailbox work for us. Hence, the conclusion that we're better off > without. Honestly, at this point, even without that rejection I'd still > want to move away because there's just so much work to do to get all the > features we need and bugs we're hitting fixed and no realistic way to > test other consumers/drivers to make sure we don't break them in the > process. > >> Because the api can not have Zero impact on any platform's >> implementation and my ack here could be used as a precedent for every >> platform to implement their own tx/rx queues and dt handling and move >> into drivers/soc/. > > As I said, there's a very clear sign here that this is the right move: > the overall code size goes down. After this series we have: > > - Less code in total (much less actually executing) > - That works better > - And is easier to understand and debug > - And requires less maintenance effort to improve > > If other platforms come to the same conclusion for their use case then > yes, they should move away from mailbox as well. I would expect that > might be the case for a subset, not all, of users. If more users follow, > that should be a sign to you that the mailbox subsystem isn't as useful > as you'd like :) > > Put another way: common code should actually save you lines of code. If > it's causing you to spend more lines of code to use it properly than it > saves, it is not useful and does not actually improve the situation. > >> A couple years later someone will see it doesn't> make sense every> platform is doing the same thing in driver/soc/ and >> maybe it s a good idea to have some drivers/mailbox/ to hold the >> common code. > > If they are really doing the same thing, sure. And then might be a good > time to re-think mailbox and what it should do and how it should offer > this common code to drivers, in a way that works for more users and > actually saves everyone time and maintenance effort, with less burden. > >> I am also aware I am just a volunteer at mailbox and can not dictate >> what you do with your platform. So, is there anything like >> Neither-acked-nor-objected-but-left-to-soc-by ? ;) > > Not really, because it's your subsystem so we do actually need you to > ack the driver deletion patch if it's going to go through our tree. > That's the rules. "Acked" doesn't mean "I am happy with this", it means > "I am okay with this" ;) Ping? :-) - Hector
Once upon a time, Apple machines had some mailbox hardware, and we had to write a driver for it. And since it was a mailbox, it felt natural to use the Linux mailbox subsystem. More than a year later, it has become evident that was not the right decision. Linux driver class subsystems generally exist for a few purposes: 1. To allow mixing and matching generic producers and consumers. 2. To implement common code that is likely to be shared across drivers, and do so correctly so correct code only has to be written once. 3. To force drivers into a "correct" design, such that driver authors avoid common pitfalls. The mailbox subsystem does not do any of the above for us: 1. Mailbox hardware is not generic; "mailbox" is a vague idea, not a standard for communication. Almost all mailbox consumers are tied to one or a few producers. There is practically no mixing and matching possible. We have one (1) consumer subsystem (RTKit) talking to one (1) mailbox driver (apple-mailbox). We might have a second consumer in the future (SEP), but there will still be no useful combinatronics with other drivers. 2. The mailbox subsystem implements a bunch of common code for queuing, but we don't need that because our hardware has hardware queues. It also implements a bunch of common code for supporting multiple channels, but we don't need that because our hardware only has one channel (it has "endpoints" but those are just tags that are part of the message). On top of this, the mailbox subsystem makes design decisions unsuitable for our use case. Its queuing implementation has a fixed queue size and fails sends when full instead of pushing back by blocking, which is completely unsuitable for high-traffic mailboxes with hard reliability requirements, such as ours. We've also run into multiple issues around using mailbox in an atomic context (required for SMC reboot/shutdown requests). 3. Mailbox doesn't really do much for us as far as driver design. If anything, it has been forcing us to add extra workarounds for the impedance mismatches between the subsystem core and the hardware. Other drivers already are inconsistent in how they use the mailbox core, because the documentation is unclear on various details. The interface for mailboxes is very simple - basically just a send() and a receive callback. This series quite literally just rips out the middleman, and connects both sides together directly. There just isn't anything useful the mailbox common code is doing for us - it's just a pile of complexity in the middle that adds bugs, impedance mismatches, overhead, and offers no extra features we can use. This series offers: - A modest reduction in overall code size (-27 net lines excluding #1). - Fixes a pile of bugs related to using the mailbox subsystem and its quirks and limitations (race conditions when messages are already in the queue on startup, atomic issues, the list goes on). - Adds runtime-PM support. - Adds support for the FIFOs in the mailbox hardware, improving performance. - Simplifies code by removing extraneous memory allocations (the mailbox subsystem requires consumers to pass pointers to messages, instead of inlining them, even though messages are usually only one or two registers in size) and the requisite cleanup/freeing in the completion path. In addition, it paves the way for future Apple-specific mailbox optimizations, such as adding the ability to de-duplicate message sends if the same message is already known to be in the FIFO (which can be done by keeping a rolling history of recently sent messages). This is useful for doorbell-style messages, which are redundant to send more than once if not yet processed. Apple Silicon platforms use these mailboxes pervasively, including as part of the GPU submission hot path. On top of that, bad interactions with firmware coprocessors can cause immediate lockups or crashes with no recovery possible but a reboot. Our requirements for reliability and performance are probably much higher than the average mailbox user, and we'd much rather not have a bunch of common code getting in the way of performance profiling and future optimization. It doesn't make much sense for the mailbox subsystem either, since solving these issues would require major refactoring that is unlikely to provide significant benefit to most other users. So let's just call usage of the mailbox subsystem a failed experiment, and move the driver into soc/apple, where we can control the API and can add whatever peculiarities we need for our mailboxes. Farewell, mailbox. There are no changes to the DT bindings. This driver has been shipping in Asahi stable kernel packages for a week, with no regressions reported by any users. As an additional non-kernel-related benefit, this introduces a direct module dependency between consumers and the mailbox producer. This, in turn, is in the critical path for the NVMe driver on these platforms. Prior to this series, we had to have custom initramfs hooks to add apple-mailbox to distro initramfses, and accidentally removing these hooks would result in a completely unbootable system (there is no way for standard initramfs machinery to detect soft consumer/producer relationships like this, they usually just go looking for block device modules and their direct dependencies). We still need the initramfs hooks for other things, but with this change, completely removing all Apple-related initramfs hooks will at least result in a *bootable* system so you can fix the problem. This has already bit several users, and it also means many more distros have a chance of working out of the box (enough to let you install extra stuff) on these platforms, instead of having a hard requirement that *every single distro* fix up their initramfs generation in order to even boot/install on these platforms at all. Jassi: Ideally I'd like to get an ack on this and merge it all through asahi-soc, so we don't have to play things patch-by-patch across multiple merge cycles to avoid potentially broken intermediate states. Signed-off-by: Hector Martin <marcan@marcan.st> --- Hector Martin (5): soc: apple: rtkit: Get rid of apple_rtkit_send_message_wait soc: apple: mailbox: Add ASC/M3 mailbox driver soc: apple: rtkit: Port to the internal mailbox driver mailbox: apple: Delete driver soc: apple: mailbox: Rename config symbol to APPLE_MAILBOX MAINTAINERS | 2 - drivers/mailbox/Kconfig | 12 - drivers/mailbox/Makefile | 2 - drivers/mailbox/apple-mailbox.c | 441 ------------------------------------- drivers/soc/apple/Kconfig | 15 +- drivers/soc/apple/Makefile | 3 + drivers/soc/apple/mailbox.c | 434 ++++++++++++++++++++++++++++++++++++ drivers/soc/apple/mailbox.h | 48 ++++ drivers/soc/apple/rtkit-internal.h | 8 +- drivers/soc/apple/rtkit.c | 133 +++-------- include/linux/apple-mailbox.h | 19 -- include/linux/soc/apple/rtkit.h | 18 -- 12 files changed, 529 insertions(+), 606 deletions(-) --- base-commit: bdfe6de2695c5bccc663a5a7d530f81925d8bc10 change-id: 20230328-soc-mailbox-3cb6bb2b0b2d Best regards,