Message ID | 20231207222944.663893-8-pierre-louis.bossart@linux.intel.com (mailing list archive) |
---|---|
State | RFC |
Headers | show |
Series | soundwire/ASoC: speed-up downloads with BTP/BRA protocol | expand |
On Thu, Dec 07, 2023 at 04:29:35PM -0600, Pierre-Louis Bossart wrote: > This patch suggests a minimum bound of 64 bytes, for smaller transfers 128 in the code. > +int sdw_bpt_close_stream(struct sdw_bus *bus, > + struct sdw_slave *slave, > + enum sdw_bpt_type mode, > + struct sdw_bpt_msg *msg) > +{ Seems weird to pass the message to close? > +int sdw_bpt_open_stream(struct sdw_bus *bus, > + struct sdw_slave *slave, > + enum sdw_bpt_type mode, > + struct sdw_bpt_msg *msg) Ditto, here does it make sense to pass the msg to open? I guess the idea is that one only sends a single message in one open->send->wait->close cycle? Would be much nicer if multiple send/waits could be done in a single open/close, or if the limitation is unavoidable why split out open/send into separate calls at all? Just have send and wait and wrap open/close into them. > + if (msg->len % SDW_BPT_MSG_BYTE_ALIGNMENT) { > + dev_err(bus->dev, "BPT message length %d is not a multiple of %d bytes\n", > + msg->len, SDW_BPT_MSG_BYTE_ALIGNMENT); > + return -EINVAL; > + } Should this really be here? My understanding is this alignment is a limitation of specific hardware so should this check not be in the Cadence master code. > +int sdw_bpt_send_async(struct sdw_bus *bus, > + struct sdw_slave *slave, > + struct sdw_bpt_msg *msg) > +{ > + if (msg->len > SDW_BPT_MSG_MAX_BYTES) > + return -EINVAL; Does this check make sense since this was already checked in open? I guess the user could pass in a different message at this stage, but that I guess is part of what feels weird about passing the message into open. > +/** > + * struct sdw_btp_msg - Message structure > + * @addr: Start Register address accessed in the Slave > + * @len: number of bytes to transfer. More than 64Kb can be transferred > + * but a practical limit of SDW_BPT_MSG_MAX_BYTES is enforced. Where is the 64kb coming from here? > +/* > + * Add a minimum number of bytes for BPT transfer sizes. BPT has a lot of > + * overhead, enabling/disabling a stream costs 6 write commands, plus the bank > + * switch that could be delayed in time. In addition, transferring very small > + * data sets over DMA is known to be problematic on multiple platforms. > + */ > +#define SDW_BPT_MSG_MIN_BYTES 128 > + Is it really necessary for the core to enforce a minimum transfer size (well except for the required alignment thing)? Yes small transfers don't obviously make sense, but there is nothing inherently wrong with doing one, which makes me feel it is excessive for the core to block more than it has to here. I would be of the opinion its up to driver writers if they have some reason to do small BRA transfers. Firstly, since we are so keen on allowing BRA and normal writes to overlap, one could see use-cases when you want to do something through BRA such that it doesn't block the normal command stream. Also there is already 1 feature on cs42l43 that can only be accessed through BRA, yes that is a heroically questionable hardware design choice. Whilst we are mostly ignoring that for now, I can see this being something some other hardware teams decide to heroically do at some point as well. Thanks, Charles
On 12/12/23 06:19, Charles Keepax wrote: > On Thu, Dec 07, 2023 at 04:29:35PM -0600, Pierre-Louis Bossart wrote: >> This patch suggests a minimum bound of 64 bytes, for smaller transfers > > 128 in the code. indeed, that's a left-over from an earlier trial. Will fix. >> +int sdw_bpt_close_stream(struct sdw_bus *bus, >> + struct sdw_slave *slave, >> + enum sdw_bpt_type mode, >> + struct sdw_bpt_msg *msg) >> +{ > > Seems weird to pass the message to close? It's not necessary in my current solution, but I opted to keep the arguments aligned. >> +int sdw_bpt_open_stream(struct sdw_bus *bus, >> + struct sdw_slave *slave, >> + enum sdw_bpt_type mode, >> + struct sdw_bpt_msg *msg) > > Ditto, here does it make sense to pass the msg to open? I guess > the idea is that one only sends a single message in one > open->send->wait->close cycle? Would be much nicer if multiple > send/waits could be done in a single open/close, or if the > limitation is unavoidable why split out open/send into separate > calls at all? Just have send and wait and wrap open/close into > them. it's needed for the Intel solution, the open() will allocate the required DMA buffers and copy the data from the messsage into the DMA buffers with the required formatting expected by the IP. An alternative would be to simplify open/close but then we have to add a hw_params/prepare and hw_free steps. I didn't see a need for such a split, unlike for audio the arguments are not going to be variable. But yes it's a good question, what exactly is an open/startup callback supposed to do in ALSA/ASoC? > >> + if (msg->len % SDW_BPT_MSG_BYTE_ALIGNMENT) { >> + dev_err(bus->dev, "BPT message length %d is not a multiple of %d bytes\n", >> + msg->len, SDW_BPT_MSG_BYTE_ALIGNMENT); >> + return -EINVAL; >> + } > > Should this really be here? My understanding is this alignment is > a limitation of specific hardware so should this check not be in > the Cadence master code. As discussed earlier, we *could* move this to host-specific parts, but then the codec driver would need to know about host alignment requirements. One of those 'be careful what you ask for' things where you may have more work to do on the codec driver side... >> +int sdw_bpt_send_async(struct sdw_bus *bus, >> + struct sdw_slave *slave, >> + struct sdw_bpt_msg *msg) >> +{ >> + if (msg->len > SDW_BPT_MSG_MAX_BYTES) >> + return -EINVAL; > > Does this check make sense since this was already checked in > open? I guess the user could pass in a different message at this > stage, but that I guess is part of what feels weird about passing > the message into open. The error check is a big open, we could assume that all the previous steps were done the right way and skip tests, or we would add a set of paranoia checks. The primary intent of those checks is to help the integration of codec drivers, it's better to get an error code and a clear error message than a kernel oops because the expected sequence is not followed. Once the integration is done, those tests are probably not very useful indeed. >> +/** >> + * struct sdw_btp_msg - Message structure >> + * @addr: Start Register address accessed in the Slave >> + * @len: number of bytes to transfer. More than 64Kb can be transferred >> + * but a practical limit of SDW_BPT_MSG_MAX_BYTES is enforced. > > Where is the 64kb coming from here? Ah, this is a reference to the 16 bit address limitation in the regular read/write commands. >> +/* >> + * Add a minimum number of bytes for BPT transfer sizes. BPT has a lot of >> + * overhead, enabling/disabling a stream costs 6 write commands, plus the bank >> + * switch that could be delayed in time. In addition, transferring very small >> + * data sets over DMA is known to be problematic on multiple platforms. >> + */ >> +#define SDW_BPT_MSG_MIN_BYTES 128 >> + > > Is it really necessary for the core to enforce a minimum transfer > size (well except for the required alignment thing)? Yes small > transfers don't obviously make sense, but there is nothing inherently > wrong with doing one, which makes me feel it is excessive for the > core to block more than it has to here. I think it's really better to have a clear design intent that BRA is not meant for small transfers. It would be opening a can of worms if we start seeing uses of this protocol beyond the intended firmware/table download and history buffer retrieval. > I would be of the opinion its up to driver writers if they have > some reason to do small BRA transfers. Firstly, since we are so > keen on allowing BRA and normal writes to overlap, one could see > use-cases when you want to do something through BRA such that it > doesn't block the normal command stream. Also there is already 1 > feature on cs42l43 that can only be accessed through BRA, yes that > is a heroically questionable hardware design choice. Whilst we are > mostly ignoring that for now, I can see this being something some > other hardware teams decide to heroically do at some point as well. To be clear, I would like to allow BRA in parallel with normal writes *to a different set of registers* to deal with alerts, etc. A set of registers only accessible through BRA is a very questionable design indeed. That's not what the SoundWire spec intended and it's not what this patchset had in mind. You'll have to tell us more on what exactly the hardware is and does, and how strongly you want this capability supported...
On 07-12-23, 16:29, Pierre-Louis Bossart wrote: > Add definitions and helpers for the BPT/BRA protocol. Peripheral > drivers (aka ASoC codec drivers) can use this API to send bulk data > such as firmware or tables. > > The API is only available when no other audio streams have been > allocated, and only one BTP/BRA stream is allowed per link. To avoid > the addition of yet another lock, the refcount tests are handled in > the stream master_runtime alloc/free routines where the bus_lock is > already held. Another benefit of this approach is that the same > bus_lock is used to handle runtime and port linked lists, which > reduces the potential for misaligned configurations. > > In addition to exclusion with audio streams, BPT transfers have a lot > of overhead, specifically registers writes are needed to enable > transport in DP0. In addition, most DMAs don't handle too well very > small data sets. > > This patch suggests a minimum bound of 64 bytes, for smaller transfers > codec drivers should rely on the regular read/write commands in > Column0. > > Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> > --- > drivers/soundwire/bus.c | 77 +++++++++++++++++++++++++++++++++++ > drivers/soundwire/bus.h | 18 ++++++++ > drivers/soundwire/stream.c | 30 ++++++++++++++ > include/linux/soundwire/sdw.h | 76 ++++++++++++++++++++++++++++++++++ > 4 files changed, 201 insertions(+) > > diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c > index f3fec15c3112..e5758d2ed88f 100644 > --- a/drivers/soundwire/bus.c > +++ b/drivers/soundwire/bus.c > @@ -2015,3 +2015,80 @@ void sdw_clear_slave_status(struct sdw_bus *bus, u32 request) > } > } > EXPORT_SYMBOL(sdw_clear_slave_status); > + > +int sdw_bpt_close_stream(struct sdw_bus *bus, > + struct sdw_slave *slave, > + enum sdw_bpt_type mode, > + struct sdw_bpt_msg *msg) > +{ > + int ret; > + > + ret = bus->ops->bpt_close_stream(bus, slave, mode, msg); > + if (ret < 0) > + dev_err(bus->dev, "BPT stream close, err %d\n", ret); > + > + return ret; > +} > +EXPORT_SYMBOL(sdw_bpt_close_stream); > + > +int sdw_bpt_open_stream(struct sdw_bus *bus, > + struct sdw_slave *slave, > + enum sdw_bpt_type mode, > + struct sdw_bpt_msg *msg) > +{ > + int ret; > + > + /* only Bulk Register Access (BRA) is supported for now */ > + if (mode != SDW_BRA) > + return -EINVAL; > + > + if (msg->len < SDW_BPT_MSG_MIN_BYTES) { > + dev_err(bus->dev, "BPT message length %d, min supported %d\n", > + msg->len, SDW_BPT_MSG_MIN_BYTES); > + return -EINVAL; > + } > + > + if (msg->len % SDW_BPT_MSG_BYTE_ALIGNMENT) { > + dev_err(bus->dev, "BPT message length %d is not a multiple of %d bytes\n", > + msg->len, SDW_BPT_MSG_BYTE_ALIGNMENT); > + return -EINVAL; > + } Is this a protocol requirement? > + > + /* check device is enumerated */ > + if (slave->dev_num == SDW_ENUM_DEV_NUM || > + slave->dev_num > SDW_MAX_DEVICES) > + return -ENODEV; > + > + /* make sure all callbacks are defined */ > + if (!bus->ops->bpt_open_stream || > + !bus->ops->bpt_close_stream || > + !bus->ops->bpt_send_async || > + !bus->ops->bpt_wait) > + return -ENOTSUPP; should this not be checked at probe time, if device declares the support > + > + ret = bus->ops->bpt_open_stream(bus, slave, mode, msg); > + if (ret < 0) > + dev_err(bus->dev, "BPT stream open, err %d\n", ret); > + > + return ret; > +} > +EXPORT_SYMBOL(sdw_bpt_open_stream); can we open multiple times (i dont see a check preventing that), how do we close..? Re-iterating my comment on documentation patch, can we do with a async api and wait api, that makes symantics a lot simpler, right..? > + > +int sdw_bpt_send_async(struct sdw_bus *bus, > + struct sdw_slave *slave, > + struct sdw_bpt_msg *msg) > +{ > + if (msg->len > SDW_BPT_MSG_MAX_BYTES) > + return -EINVAL; > + > + return bus->ops->bpt_send_async(bus, slave, msg); > +} > +EXPORT_SYMBOL(sdw_bpt_send_async); Can we call this multiple times after open, it is unclear to me. Can you please add kernel-doc comments about the APIs here as well > struct sdw_master_ops { > int (*read_prop)(struct sdw_bus *bus); > @@ -869,6 +913,20 @@ struct sdw_master_ops { > void (*new_peripheral_assigned)(struct sdw_bus *bus, > struct sdw_slave *slave, > int dev_num); > + int (*bpt_open_stream)(struct sdw_bus *bus, > + struct sdw_slave *slave, > + enum sdw_bpt_type mode, > + struct sdw_bpt_msg *msg); > + int (*bpt_close_stream)(struct sdw_bus *bus, > + struct sdw_slave *slave, > + enum sdw_bpt_type mode, > + struct sdw_bpt_msg *msg); > + int (*bpt_send_async)(struct sdw_bus *bus, > + struct sdw_slave *slave, > + struct sdw_bpt_msg *msg); > + int (*bpt_wait)(struct sdw_bus *bus, > + struct sdw_slave *slave, > + struct sdw_bpt_msg *msg); do we need both bus and slave, that was a mistake in orignal design IMO. We should fix that for bpt_ apis
>> +int sdw_bpt_open_stream(struct sdw_bus *bus, >> + struct sdw_slave *slave, >> + enum sdw_bpt_type mode, >> + struct sdw_bpt_msg *msg) >> +{ >> + int ret; >> + >> + /* only Bulk Register Access (BRA) is supported for now */ >> + if (mode != SDW_BRA) >> + return -EINVAL; >> + >> + if (msg->len < SDW_BPT_MSG_MIN_BYTES) { >> + dev_err(bus->dev, "BPT message length %d, min supported %d\n", >> + msg->len, SDW_BPT_MSG_MIN_BYTES); >> + return -EINVAL; >> + } >> + >> + if (msg->len % SDW_BPT_MSG_BYTE_ALIGNMENT) { >> + dev_err(bus->dev, "BPT message length %d is not a multiple of %d bytes\n", >> + msg->len, SDW_BPT_MSG_BYTE_ALIGNMENT); >> + return -EINVAL; >> + } > > Is this a protocol requirement? No, it's an implementation requirement. We could move this to host-specific parts but then the codec drivers will have to know about alignment requirements for each host they are use with. IOW, it's more work for codec drivers if we don't have a minimum bar for alignment requirement across all platforms. > >> + >> + /* check device is enumerated */ >> + if (slave->dev_num == SDW_ENUM_DEV_NUM || >> + slave->dev_num > SDW_MAX_DEVICES) >> + return -ENODEV; >> + >> + /* make sure all callbacks are defined */ >> + if (!bus->ops->bpt_open_stream || >> + !bus->ops->bpt_close_stream || >> + !bus->ops->bpt_send_async || >> + !bus->ops->bpt_wait) >> + return -ENOTSUPP; > > should this not be checked at probe time, if device declares the support sdw_bpt_open_stream() would be called by the peripheral driver (or regmap as a proxy). The peripheral driver could also decide to check for those callback during its probe, but that's beyond the scope of this patchset. These checks are just there for paranoia, in case a peripheral driver uses BTP/BRA on a host where they are not supported. It's not science-fiction, we see AMD- and INTEL-based platforms using the same SoundWire-based codecs. >> + ret = bus->ops->bpt_open_stream(bus, slave, mode, msg); >> + if (ret < 0) >> + dev_err(bus->dev, "BPT stream open, err %d\n", ret); >> + >> + return ret; >> +} >> +EXPORT_SYMBOL(sdw_bpt_open_stream); > > can we open multiple times (i dont see a check preventing that), how do > we close..? there's a refcount preventing multiples BTP streams from being opened. > Re-iterating my comment on documentation patch, can we do with a async api > and wait api, that makes symantics a lot simpler, right..? see reply in previous email, combining open+send is weird IMHO. >> + >> +int sdw_bpt_send_async(struct sdw_bus *bus, >> + struct sdw_slave *slave, >> + struct sdw_bpt_msg *msg) >> +{ >> + if (msg->len > SDW_BPT_MSG_MAX_BYTES) >> + return -EINVAL; >> + >> + return bus->ops->bpt_send_async(bus, slave, msg); >> +} >> +EXPORT_SYMBOL(sdw_bpt_send_async); > > Can we call this multiple times after open, it is unclear to me. Can you > please add kernel-doc comments about the APIs here as well This can be called multiple times but it's useless: all the buffers are prepared in the open() stage. This is the moral equivalent of a trigger step, just enable data transfers. > >> struct sdw_master_ops { >> int (*read_prop)(struct sdw_bus *bus); >> @@ -869,6 +913,20 @@ struct sdw_master_ops { >> void (*new_peripheral_assigned)(struct sdw_bus *bus, >> struct sdw_slave *slave, >> int dev_num); >> + int (*bpt_open_stream)(struct sdw_bus *bus, >> + struct sdw_slave *slave, >> + enum sdw_bpt_type mode, >> + struct sdw_bpt_msg *msg); >> + int (*bpt_close_stream)(struct sdw_bus *bus, >> + struct sdw_slave *slave, >> + enum sdw_bpt_type mode, >> + struct sdw_bpt_msg *msg); >> + int (*bpt_send_async)(struct sdw_bus *bus, >> + struct sdw_slave *slave, >> + struct sdw_bpt_msg *msg); >> + int (*bpt_wait)(struct sdw_bus *bus, >> + struct sdw_slave *slave, >> + struct sdw_bpt_msg *msg); > > do we need both bus and slave, that was a mistake in orignal design IMO. > We should fix that for bpt_ apis No disagreement. All the routines follow the same template, if we change one we should also change the others. The main question as discussed with Charles is whether we want to pass the 'msg' argument in all routines.
On Mon, Dec 18, 2023 at 02:12:36PM +0100, Pierre-Louis Bossart wrote: > > Is this a protocol requirement? > > No, it's an implementation requirement. > > We could move this to host-specific parts but then the codec drivers > will have to know about alignment requirements for each host they are > use with. IOW, it's more work for codec drivers if we don't have a > minimum bar for alignment requirement across all platforms. > I do certainly see that side of the argument and it does probably warrant some thought as to how a slave might learn the alignment requirements. I guess maybe some sort of core helper function to return the alignment? Or putting it in properties the slave can access? One could even keep the check here, but just pull the value from something system specific. The danger with putting it in the core is IMHO: a) It rules out certain use-cases, generally I think its a bad idea if the framework design prohibits stuff the underlying bus could do because someone will, at some point, want to do it. b) The core limit could get a bit out of hand once more controllers are added. The core limit needs to be a multiple of all the controller limits, if a controller comes along with a weird alignment requirement, that gets problematic fast. Thanks, Charles
On 12/18/23 08:57, Charles Keepax wrote: > On Mon, Dec 18, 2023 at 02:12:36PM +0100, Pierre-Louis Bossart wrote: >>> Is this a protocol requirement? >> >> No, it's an implementation requirement. >> >> We could move this to host-specific parts but then the codec drivers >> will have to know about alignment requirements for each host they are >> use with. IOW, it's more work for codec drivers if we don't have a >> minimum bar for alignment requirement across all platforms. >> > > I do certainly see that side of the argument and it does probably > warrant some thought as to how a slave might learn the alignment > requirements. I guess maybe some sort of core helper function to > return the alignment? Or putting it in properties the slave can > access? One could even keep the check here, but just pull the > value from something system specific. > > The danger with putting it in the core is IMHO: > > a) It rules out certain use-cases, generally I think its a bad > idea if the framework design prohibits stuff the underlying bus > could do because someone will, at some point, want to do it. SoundWire has lots of fancy and borderline nebulous concepts, my take is "let's do few things and do them well". We can always revisit new usages later, for now my main objective is "speed up downloads". > b) The core limit could get a bit out of hand once more > controllers are added. The core limit needs to be a multiple of > all the controller limits, if a controller comes along with a > weird alignment requirement, that gets problematic fast. I don't have any information on other controllers, but I wouldn't be surprised if most of the quirks are due to peripheral limitations and ambiguous interpretations of what 'ACK' means. I tried writing to reserved parts of the memory or non-existent registers and nothing bad was reported...
On 18-12-23, 14:12, Pierre-Louis Bossart wrote: > > >> +int sdw_bpt_open_stream(struct sdw_bus *bus, > >> + struct sdw_slave *slave, > >> + enum sdw_bpt_type mode, > >> + struct sdw_bpt_msg *msg) > >> +{ > >> + int ret; > >> + > >> + /* only Bulk Register Access (BRA) is supported for now */ > >> + if (mode != SDW_BRA) > >> + return -EINVAL; > >> + > >> + if (msg->len < SDW_BPT_MSG_MIN_BYTES) { > >> + dev_err(bus->dev, "BPT message length %d, min supported %d\n", > >> + msg->len, SDW_BPT_MSG_MIN_BYTES); > >> + return -EINVAL; > >> + } > >> + > >> + if (msg->len % SDW_BPT_MSG_BYTE_ALIGNMENT) { > >> + dev_err(bus->dev, "BPT message length %d is not a multiple of %d bytes\n", > >> + msg->len, SDW_BPT_MSG_BYTE_ALIGNMENT); > >> + return -EINVAL; > >> + } > > > > Is this a protocol requirement? > > No, it's an implementation requirement. > > We could move this to host-specific parts but then the codec drivers > will have to know about alignment requirements for each host they are > use with. IOW, it's more work for codec drivers if we don't have a > minimum bar for alignment requirement across all platforms. > > > > >> + > >> + /* check device is enumerated */ > >> + if (slave->dev_num == SDW_ENUM_DEV_NUM || > >> + slave->dev_num > SDW_MAX_DEVICES) > >> + return -ENODEV; > >> + > >> + /* make sure all callbacks are defined */ > >> + if (!bus->ops->bpt_open_stream || > >> + !bus->ops->bpt_close_stream || > >> + !bus->ops->bpt_send_async || > >> + !bus->ops->bpt_wait) > >> + return -ENOTSUPP; > > > > should this not be checked at probe time, if device declares the support > > sdw_bpt_open_stream() would be called by the peripheral driver (or > regmap as a proxy). The peripheral driver could also decide to check for > those callback during its probe, but that's beyond the scope of this > patchset. I would think that it is better to have capablities registered by the driver and those are checked at registration, so we know if bpt is supported or not for a particular platform. This make more sense to me as some driver, depending on the SoC may or maynot support this, so easy way would be to turn off caps, what do you think? > > These checks are just there for paranoia, in case a peripheral driver > uses BTP/BRA on a host where they are not supported. > > It's not science-fiction, we see AMD- and INTEL-based platforms using > the same SoundWire-based codecs. Ofcourse, it is entrely reasonable thing to do, event across x86/arm64 > > >> + ret = bus->ops->bpt_open_stream(bus, slave, mode, msg); > >> + if (ret < 0) > >> + dev_err(bus->dev, "BPT stream open, err %d\n", ret); > >> + > >> + return ret; > >> +} > >> +EXPORT_SYMBOL(sdw_bpt_open_stream); > > > > can we open multiple times (i dont see a check preventing that), how do > > we close..? > > there's a refcount preventing multiples BTP streams from being opened. > > > Re-iterating my comment on documentation patch, can we do with a async api > > and wait api, that makes symantics a lot simpler, right..? > > see reply in previous email, combining open+send is weird IMHO. > > >> + > >> +int sdw_bpt_send_async(struct sdw_bus *bus, > >> + struct sdw_slave *slave, > >> + struct sdw_bpt_msg *msg) > >> +{ > >> + if (msg->len > SDW_BPT_MSG_MAX_BYTES) > >> + return -EINVAL; > >> + > >> + return bus->ops->bpt_send_async(bus, slave, msg); > >> +} > >> +EXPORT_SYMBOL(sdw_bpt_send_async); > > > > Can we call this multiple times after open, it is unclear to me. Can you > > please add kernel-doc comments about the APIs here as well > > This can be called multiple times but it's useless: all the buffers are > prepared in the open() stage. This is the moral equivalent of a trigger > step, just enable data transfers. > > > > >> struct sdw_master_ops { > >> int (*read_prop)(struct sdw_bus *bus); > >> @@ -869,6 +913,20 @@ struct sdw_master_ops { > >> void (*new_peripheral_assigned)(struct sdw_bus *bus, > >> struct sdw_slave *slave, > >> int dev_num); > >> + int (*bpt_open_stream)(struct sdw_bus *bus, > >> + struct sdw_slave *slave, > >> + enum sdw_bpt_type mode, > >> + struct sdw_bpt_msg *msg); > >> + int (*bpt_close_stream)(struct sdw_bus *bus, > >> + struct sdw_slave *slave, > >> + enum sdw_bpt_type mode, > >> + struct sdw_bpt_msg *msg); > >> + int (*bpt_send_async)(struct sdw_bus *bus, > >> + struct sdw_slave *slave, > >> + struct sdw_bpt_msg *msg); > >> + int (*bpt_wait)(struct sdw_bus *bus, > >> + struct sdw_slave *slave, > >> + struct sdw_bpt_msg *msg); > > > > do we need both bus and slave, that was a mistake in orignal design IMO. > > We should fix that for bpt_ apis > > No disagreement. All the routines follow the same template, if we change > one we should also change the others. > > The main question as discussed with Charles is whether we want to pass > the 'msg' argument in all routines. Lets revisit when we have new API
diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c index f3fec15c3112..e5758d2ed88f 100644 --- a/drivers/soundwire/bus.c +++ b/drivers/soundwire/bus.c @@ -2015,3 +2015,80 @@ void sdw_clear_slave_status(struct sdw_bus *bus, u32 request) } } EXPORT_SYMBOL(sdw_clear_slave_status); + +int sdw_bpt_close_stream(struct sdw_bus *bus, + struct sdw_slave *slave, + enum sdw_bpt_type mode, + struct sdw_bpt_msg *msg) +{ + int ret; + + ret = bus->ops->bpt_close_stream(bus, slave, mode, msg); + if (ret < 0) + dev_err(bus->dev, "BPT stream close, err %d\n", ret); + + return ret; +} +EXPORT_SYMBOL(sdw_bpt_close_stream); + +int sdw_bpt_open_stream(struct sdw_bus *bus, + struct sdw_slave *slave, + enum sdw_bpt_type mode, + struct sdw_bpt_msg *msg) +{ + int ret; + + /* only Bulk Register Access (BRA) is supported for now */ + if (mode != SDW_BRA) + return -EINVAL; + + if (msg->len < SDW_BPT_MSG_MIN_BYTES) { + dev_err(bus->dev, "BPT message length %d, min supported %d\n", + msg->len, SDW_BPT_MSG_MIN_BYTES); + return -EINVAL; + } + + if (msg->len % SDW_BPT_MSG_BYTE_ALIGNMENT) { + dev_err(bus->dev, "BPT message length %d is not a multiple of %d bytes\n", + msg->len, SDW_BPT_MSG_BYTE_ALIGNMENT); + return -EINVAL; + } + + /* check device is enumerated */ + if (slave->dev_num == SDW_ENUM_DEV_NUM || + slave->dev_num > SDW_MAX_DEVICES) + return -ENODEV; + + /* make sure all callbacks are defined */ + if (!bus->ops->bpt_open_stream || + !bus->ops->bpt_close_stream || + !bus->ops->bpt_send_async || + !bus->ops->bpt_wait) + return -ENOTSUPP; + + ret = bus->ops->bpt_open_stream(bus, slave, mode, msg); + if (ret < 0) + dev_err(bus->dev, "BPT stream open, err %d\n", ret); + + return ret; +} +EXPORT_SYMBOL(sdw_bpt_open_stream); + +int sdw_bpt_send_async(struct sdw_bus *bus, + struct sdw_slave *slave, + struct sdw_bpt_msg *msg) +{ + if (msg->len > SDW_BPT_MSG_MAX_BYTES) + return -EINVAL; + + return bus->ops->bpt_send_async(bus, slave, msg); +} +EXPORT_SYMBOL(sdw_bpt_send_async); + +int sdw_bpt_wait(struct sdw_bus *bus, + struct sdw_slave *slave, + struct sdw_bpt_msg *msg) +{ + return bus->ops->bpt_wait(bus, slave, msg); +} +EXPORT_SYMBOL(sdw_bpt_wait); diff --git a/drivers/soundwire/bus.h b/drivers/soundwire/bus.h index fda6b24ac2da..936852ab3154 100644 --- a/drivers/soundwire/bus.h +++ b/drivers/soundwire/bus.h @@ -72,6 +72,24 @@ struct sdw_msg { bool page; }; +/** + * struct sdw_btp_msg - Message structure + * @addr: Start Register address accessed in the Slave + * @len: number of bytes to transfer. More than 64Kb can be transferred + * but a practical limit of SDW_BPT_MSG_MAX_BYTES is enforced. + * @dev_num: Slave device number + * @flags: transfer flags, indicate if xfer is read or write + * @buf: message data buffer (filled by host for write, filled + * by Peripheral hardware for reads) + */ +struct sdw_bpt_msg { + u32 addr; + u32 len; + u8 dev_num; + u8 flags; + u8 *buf; +}; + #define SDW_DOUBLE_RATE_FACTOR 2 #define SDW_STRM_RATE_GROUPING 1 diff --git a/drivers/soundwire/stream.c b/drivers/soundwire/stream.c index f9762610f051..b097b1a808f9 100644 --- a/drivers/soundwire/stream.c +++ b/drivers/soundwire/stream.c @@ -1189,6 +1189,20 @@ static struct sdw_master_runtime struct sdw_master_runtime *m_rt, *walk_m_rt; struct list_head *insert_after; + if (stream->type == SDW_STREAM_BPT) { + if (bus->stream_refcount > 0 || bus->bpt_stream_refcount > 0) { + dev_err(bus->dev, "%s: invalid configuration %d streams %d BPT streams\n", + __func__, bus->stream_refcount, bus->bpt_stream_refcount); + return ERR_PTR(-EBUSY); + } + } else { + if (bus->bpt_stream_refcount > 0) { + dev_err(bus->dev, "%s: BTP stream already allocated\n", + __func__); + return ERR_PTR(-EAGAIN); + } + } + m_rt = kzalloc(sizeof(*m_rt), GFP_KERNEL); if (!m_rt) return NULL; @@ -1217,6 +1231,8 @@ static struct sdw_master_runtime m_rt->stream = stream; bus->stream_refcount++; + if (stream->type == SDW_STREAM_BPT) + bus->bpt_stream_refcount++; return m_rt; } @@ -1265,6 +1281,8 @@ static void sdw_master_rt_free(struct sdw_master_runtime *m_rt, list_del(&m_rt->bus_node); kfree(m_rt); + if (stream->type == SDW_STREAM_BPT) + bus->bpt_stream_refcount--; bus->stream_refcount--; } @@ -1941,6 +1959,12 @@ int sdw_stream_add_master(struct sdw_bus *bus, m_rt = sdw_master_rt_find(bus, stream); if (!m_rt) { m_rt = sdw_master_rt_alloc(bus, stream); + if (IS_ERR(m_rt)) { + ret = PTR_ERR(m_rt); + dev_err(bus->dev, "%s: Master runtime alloc failed for stream:%s: %d\n", + __func__, stream->name, ret); + goto unlock; + } if (!m_rt) { dev_err(bus->dev, "%s: Master runtime alloc failed for stream:%s\n", __func__, stream->name); @@ -2056,6 +2080,12 @@ int sdw_stream_add_slave(struct sdw_slave *slave, * So, allocate m_rt and add Slave to it. */ m_rt = sdw_master_rt_alloc(slave->bus, stream); + if (IS_ERR(m_rt)) { + ret = PTR_ERR(m_rt); + dev_err(&slave->dev, "%s: Master runtime alloc failed for stream:%s: %d\n", + __func__, stream->name, ret); + goto unlock; + } if (!m_rt) { dev_err(&slave->dev, "%s: Master runtime alloc failed for stream:%s\n", __func__, stream->name); diff --git a/include/linux/soundwire/sdw.h b/include/linux/soundwire/sdw.h index 1dda65d91bad..e54c5bbd2b91 100644 --- a/include/linux/soundwire/sdw.h +++ b/include/linux/soundwire/sdw.h @@ -835,6 +835,46 @@ struct sdw_defer { struct sdw_msg *msg; }; +/* + * Add a practical limit to BPT transfer sizes. BPT is typically used + * to transfer firmware, and larger firmware transfers will increase + * the cold latency beyond typical OS or user requirements. + */ +#define SDW_BPT_MSG_MAX_BYTES (1024 * 1024) + +/* + * Add a minimum number of bytes for BPT transfer sizes. BPT has a lot of + * overhead, enabling/disabling a stream costs 6 write commands, plus the bank + * switch that could be delayed in time. In addition, transferring very small + * data sets over DMA is known to be problematic on multiple platforms. + */ +#define SDW_BPT_MSG_MIN_BYTES 128 + +/* + * Add an alignment requirement for BPT transfer sizes. BPT is + * typically with DMAs which are most-efficient with data aligned to a + * cache line. For example HDaudio DMAs strongly recommend to use + * multiples of 128 bytes. + * This could be a platform-specific value, but that would be problematic for + * codec drivers which are typically not platform-specific by definition. + * It's probably simpler for everyone to align on a baseline and use regular + * read/write commands for the remaining bytes. + */ +#define SDW_BPT_MSG_BYTE_ALIGNMENT 128 + +/** + * sdw_bpt_type - SoundWire Bulk Payload Transfer (BPT) type + * @SDW_BRA: Soundwire Bulk Register Access (BRA) + * + * The SoundWire specification allows for implementation-defined + * solutions, they would be added after BRA. + */ +enum sdw_bpt_type { + SDW_BRA, +}; + +struct sdw_bpt_msg; + /** * struct sdw_master_ops - Master driver ops * @read_prop: Read Master properties @@ -850,6 +890,10 @@ struct sdw_defer { * @get_device_num: Callback for vendor-specific device_number allocation * @put_device_num: Callback for vendor-specific device_number release * @new_peripheral_assigned: Callback to handle enumeration of new peripheral. + * @bpt_open_stream: reserve resources for BPT stream + * @bpt_close_stream: release resources for BPT stream + * @bpt_send_async: send message using BTP protocol + * @bpt_wait: wait for message completion using BTP protocol */ struct sdw_master_ops { int (*read_prop)(struct sdw_bus *bus); @@ -869,6 +913,20 @@ struct sdw_master_ops { void (*new_peripheral_assigned)(struct sdw_bus *bus, struct sdw_slave *slave, int dev_num); + int (*bpt_open_stream)(struct sdw_bus *bus, + struct sdw_slave *slave, + enum sdw_bpt_type mode, + struct sdw_bpt_msg *msg); + int (*bpt_close_stream)(struct sdw_bus *bus, + struct sdw_slave *slave, + enum sdw_bpt_type mode, + struct sdw_bpt_msg *msg); + int (*bpt_send_async)(struct sdw_bus *bus, + struct sdw_slave *slave, + struct sdw_bpt_msg *msg); + int (*bpt_wait)(struct sdw_bus *bus, + struct sdw_slave *slave, + struct sdw_bpt_msg *msg); }; /** @@ -905,6 +963,8 @@ struct sdw_master_ops { * synchronization will be used even if a stream only uses a single * SoundWire segment. * @stream_refcount: number of streams currently using this bus + * @btp_stream_refcount: number of BTP streams currently using this bus (should + * be zero or one, multiple streams per link is not supported). */ struct sdw_bus { struct device *dev; @@ -935,6 +995,7 @@ struct sdw_bus { bool multi_link; int hw_sync_min_links; int stream_refcount; + int bpt_stream_refcount; }; int sdw_bus_master_add(struct sdw_bus *bus, struct device *parent, @@ -1052,6 +1113,21 @@ int sdw_bus_exit_clk_stop(struct sdw_bus *bus); int sdw_compare_devid(struct sdw_slave *slave, struct sdw_slave_id id); void sdw_extract_slave_id(struct sdw_bus *bus, u64 addr, struct sdw_slave_id *id); +int sdw_bpt_open_stream(struct sdw_bus *bus, + struct sdw_slave *slave, + enum sdw_bpt_type mode, + struct sdw_bpt_msg *msg); +int sdw_bpt_close_stream(struct sdw_bus *bus, + struct sdw_slave *slave, + enum sdw_bpt_type mode, + struct sdw_bpt_msg *msg); +int sdw_bpt_send_async(struct sdw_bus *bus, + struct sdw_slave *slave, + struct sdw_bpt_msg *msg); +int sdw_bpt_wait(struct sdw_bus *bus, + struct sdw_slave *slave, + struct sdw_bpt_msg *msg); + #if IS_ENABLED(CONFIG_SOUNDWIRE) int sdw_stream_add_slave(struct sdw_slave *slave,
Add definitions and helpers for the BPT/BRA protocol. Peripheral drivers (aka ASoC codec drivers) can use this API to send bulk data such as firmware or tables. The API is only available when no other audio streams have been allocated, and only one BTP/BRA stream is allowed per link. To avoid the addition of yet another lock, the refcount tests are handled in the stream master_runtime alloc/free routines where the bus_lock is already held. Another benefit of this approach is that the same bus_lock is used to handle runtime and port linked lists, which reduces the potential for misaligned configurations. In addition to exclusion with audio streams, BPT transfers have a lot of overhead, specifically registers writes are needed to enable transport in DP0. In addition, most DMAs don't handle too well very small data sets. This patch suggests a minimum bound of 64 bytes, for smaller transfers codec drivers should rely on the regular read/write commands in Column0. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> --- drivers/soundwire/bus.c | 77 +++++++++++++++++++++++++++++++++++ drivers/soundwire/bus.h | 18 ++++++++ drivers/soundwire/stream.c | 30 ++++++++++++++ include/linux/soundwire/sdw.h | 76 ++++++++++++++++++++++++++++++++++ 4 files changed, 201 insertions(+)