Message ID | 20230126163008.3676950-1-andersson@kernel.org (mailing list archive) |
---|---|
State | New |
Series | [GIT,PULL] Qualcomm driver updates for v6.3 |

On Thu, Jan 26, 2023, at 17:30, Bjorn Andersson wrote:
> The following changes since commit 6049aae52392539e505bfb8ccbcff3c26f1d2f0b:
>
> ----------------------------------------------------------------
> Qualcomm driver updates for v6.3
>
> This introduces a new driver for the Data Capture and Compare block,
> which provides a mechanism for capturing hardware state (access MMIO
> registers) either upon request or triggered automatically e.g. upon a
> watchdog bite, for post mortem analysis.
>
> The remote filesystem memory share driver gains support for having its
> memory bound to more than a single VMID.
>
> The SCM driver gains the minimal support needed to support a new
> mechanism where secure world can put calls on hold and later request
> them to be retried.
>
> Support for the new SA8775P platform is added to rpmhpd, QDU1000 is
> added to the SCM driver and a long list of platforms are added to the
> socinfo driver. Support for socinfo data revision 16 is also introduced.
>
> Lastly a driver to program the ramp controller in MSM8976 is introduced.

Hi Bjorn,

I don't feel comfortable merging the DCC driver through drivers/soc/
at this point: This is the first time I see the driver and it introduces
a complex user space ABI that I have no time to review as part of the
merge process.

I usually try to avoid adding any custom user space interfaces
in drivers/soc, as these tend to be things that end up being
similar to other chips and need a generic interface.

In particular I don't see an explanation about how the new interface
relates to the established drivers/hwtracing/ subsystem and why it
shouldn't be part of that (adding the hwtracing and coresight
maintainers to Cc in case they have looked at this already).

Can you send an updated pull request that leaves out the
DCC driver until we have clarified these points?

Arnd

On Mon, Jan 30, 2023 at 04:18:45PM +0100, Arnd Bergmann wrote:
> On Thu, Jan 26, 2023, at 17:30, Bjorn Andersson wrote:
> > The following changes since commit 6049aae52392539e505bfb8ccbcff3c26f1d2f0b:
> >
> > ----------------------------------------------------------------
> > Qualcomm driver updates for v6.3
> >
> > This introduces a new driver for the Data Capture and Compare block,
> > which provides a mechanism for capturing hardware state (access MMIO
> > registers) either upon request or triggered automatically e.g. upon a
> > watchdog bite, for post mortem analysis.
> >
> > The remote filesystem memory share driver gains support for having its
> > memory bound to more than a single VMID.
> >
> > The SCM driver gains the minimal support needed to support a new
> > mechanism where secure world can put calls on hold and later request
> > them to be retried.
> >
> > Support for the new SA8775P platform is added to rpmhpd, QDU1000 is
> > added to the SCM driver and a long list of platforms are added to the
> > socinfo driver. Support for socinfo data revision 16 is also introduced.
> >
> > Lastly a driver to program the ramp controller in MSM8976 is introduced.
>
> Hi Bjorn,
>
> I don't feel comfortable merging the DCC driver through drivers/soc/
> at this point: This is the first time I see the driver and it introduces
> a complex user space ABI that I have no time to review as part of the
> merge process.
>

The DCC driver has made 22 versions over the last 23 months, but now
that I look back I do agree that the recipients list has been too
limited.

Furthermore, due to the complexity of the ABI I steered this towards
debugfs, with the explicit mentioning that we will change the interface
if needed - in particular since not a lot of review interest has
been shown...

> I usually try to avoid adding any custom user space interfaces
> in drivers/soc, as these tend to be things that end up being
> similar to other chips and need a generic interface.
>

I have no concern with that, but I'm not able to suggest an existing
subsystem where this would fit.

> In particular I don't see an explanation about how the new interface
> relates to the established drivers/hwtracing/ subsystem and why it
> shouldn't be part of that (adding the hwtracing and coresight
> maintainers to Cc in case they have looked at this already).
>

To my knowledge the hwtracing framework is an interface for
enabling/disabling traces and then you get a stream of trace data out of
it.

With DCC you essentially write a small "program" to be run at the time
of an exception (or triggered manually). When the "program" is run it
acquires data from mmio interfaces and stores data in sram, which can
then be retrieved - possibly after the fatal reset of the system.

Perhaps I've misunderstood the hwtracing framework, please help me steer
Souradeep towards a subsystem you find suitable for this functionality.

> Can you send an updated pull request that leaves out the
> DCC driver until we have clarified these points?
>

I will send a new pull request, with the driver addition reverted. I
don't think there's anything controversial with the DT binding, so let's
keep that and the dts nodes (we can move the yaml if a better home is
found...).

Regards,
Bjorn
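
For readers unfamiliar with the capture model Bjorn describes, here is a minimal user-space sketch of the idea: a list of register addresses acts as the "program", and a trigger copies their current values into a separate buffer that outlives the register state. It is a toy model only. `ToyCaptureEngine`, its methods, and the addresses are all hypothetical and do not reflect the real DCC hardware or the debugfs interface of the driver.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToyCaptureEngine:
    """Toy stand-in for a DCC-like block; every name here is hypothetical."""

    mmio: Dict[int, int]                                       # fake register space
    program: List[int] = field(default_factory=list)           # addresses to read
    sram: List[Tuple[int, int]] = field(default_factory=list)  # captured data

    def add_read(self, addr: int) -> None:
        """Append one 'read this register' step to the program."""
        self.program.append(addr)

    def trigger(self) -> None:
        """Run the program: copy each listed register into the SRAM buffer."""
        self.sram = [(addr, self.mmio[addr]) for addr in self.program]


regs = {0x1000: 0xDEADBEEF, 0x1004: 0x00000001, 0x2000: 0x12345678}
engine = ToyCaptureEngine(mmio=regs)
engine.add_read(0x1000)
engine.add_read(0x2000)

engine.trigger()  # stands in for a watchdog bite or a manual trigger
regs.clear()      # stands in for register state being lost on reset

# The captured values are still available even though the registers are gone.
for addr, value in engine.sram:
    print(f"{addr:#06x} = {value:#010x}")
```

The point of the sketch is the contrast with a trace stream: nothing is produced continuously, and the only output is the snapshot taken at trigger time.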

On Mon, Jan 30, 2023, at 23:24, Bjorn Andersson wrote:
> On Mon, Jan 30, 2023 at 04:18:45PM +0100, Arnd Bergmann wrote:
>> On Thu, Jan 26, 2023, at 17:30, Bjorn Andersson wrote:
>>
>> I don't feel comfortable merging the DCC driver through drivers/soc/
>> at this point: This is the first time I see the driver and it introduces
>> a complex user space ABI that I have no time to review as part of the
>> merge process.
>>
>
> The DCC driver has made 22 versions over the last 23 months, but now
> that I look back I do agree that the recipients list has been too
> limited.
>
> Furthermore, due to the complexity of the ABI I steered this towards
> debugfs, with the explicit mentioning that we will change the interface
> if needed - in particular since not a lot of review interest has
> been shown...

I'm sorry to hear this has already taken so long, I understand it's
frustrating to come up with a good userspace interface for any of this.

>> I usually try to avoid adding any custom user space interfaces
>> in drivers/soc, as these tend to be things that end up being
>> similar to other chips and need a generic interface.
>>
>
> I have no concern with that, but I'm not able to suggest an existing
> subsystem where this would fit.
>
>> In particular I don't see an explanation about how the new interface
>> relates to the established drivers/hwtracing/ subsystem and why it
>> shouldn't be part of that (adding the hwtracing and coresight
>> maintainers to Cc in case they have looked at this already).
>>
>
> To my knowledge the hwtracing framework is an interface for
> enabling/disabling traces and then you get a stream of trace data out of
> it.
>
> With DCC you essentially write a small "program" to be run at the time
> of an exception (or triggered manually). When the "program" is run it
> acquires data from mmio interfaces and stores data in sram, which can
> then be retrieved - possibly after the fatal reset of the system.
>
> Perhaps I've misunderstood the hwtracing framework, please help me steer
> Souradeep towards a subsystem you find suitable for this functionality.

I'm also not too familiar with tracing infrastructure and was hoping
that the coresight maintainers (Mathieu, Suzuki, Mike and Leo)
would have some suggestions here. My initial guess was that in
both cases, you have hardware support that is abstracted by the
kernel in order to have a user interface that can be consumed
by the 'perf' tool.

I probably misinterpreted the part about the crash based trigger
here, as my original (brief) reading was that the data snapshot
could be triggered by any kind of event in the machine, which
would make this useful as a more general way of tracing the
state of devices at runtime. Can you describe how the crash
trigger works, and if this would be usable with other random
hardware events aside from an explicit software event?

I've added the perf maintainers to Cc as well now, for reference,
the now reverted commit is at

https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux.git/commit/?h=drivers-for-6.3&id=4cbe60cf5ad62

and it contains both the implementation and the documentation
of the debugfs interface.

One bit I don't see is the user space side. Is there a patch for
perf as well, or is the idea to use a custom tool for this? How
does userspace know which MMIO addresses are even valid here?

If the possible use is purely for saving some state across
a reboot, as opposed to other events, I wonder if there is
a good way to integrate it into the fs/pstore/ code, which
already has a way to multiplex various kinds of input (log
buffer, ftrace call chain, userspace strings, ...) into
various kinds of persistent buffers (sram, blockdev, mtd,
efivars, ...) with the purpose of helping analyze the
state after a reboot.

>> Can you send an updated pull request that leaves out the
>> DCC driver until we have clarified these points?
>>
>
> I will send a new pull request, with the driver addition reverted. I
> don't think there's anything controversial with the DT binding, so let's
> keep that and the dts nodes (we can move the yaml if a better home is
> found...)

Right, this is fine. I merged the first pull request after I saw
the revert in the second drivers branch, though I did not see
a pull request from you that replaced the first one with just the
revert added as I had expected.

Also, I see that patchwork never noticed me merging the PR, so you
did not get the automated email. Maybe you can double check the
contents of the soc/drivers branch to see if the contents are what
you expect them to be.

Arnd

On Wed, Feb 15, 2023 at 04:05:36PM +0100, Arnd Bergmann wrote:

[...]

> > To my knowledge the hwtracing framework is an interface for
> > enabling/disabling traces and then you get a stream of trace data out of
> > it.
> >
> > With DCC you essentially write a small "program" to be run at the time
> > of an exception (or triggered manually). When the "program" is run it
> > acquires data from mmio interfaces and stores data in sram, which can
> > then be retrieved - possibly after the fatal reset of the system.
> >
> > Perhaps I've misunderstood the hwtracing framework, please help me steer
> > Souradeep towards a subsystem you find suitable for this functionality.
>
> I'm also not too familiar with tracing infrastructure and was hoping
> that the coresight maintainers (Mathieu, Suzuki, Mike and Leo)
> would have some suggestions here. My initial guess was that in
> both cases, you have hardware support that is abstracted by the
> kernel in order to have a user interface that can be consumed
> by the 'perf' tool.

My understanding is hwtracing provides a common framework for STM so
that different tracing IPs (like Intel_th and Arm CoreSight) can
register STM module into this framework. The framework code is placed
in: linux/drivers/hwtracing/stm.

Now kernel doesn't provide a general framework for all hardware tracing
IPs, e.g. Arm CoreSight has its own framework to manage tracing
components and creating links with sinks.

Simply to say, we can place DCC driver in linux/drivers/hwtracing folder
(like Hisilicon's ptt driver), but we have no common framework for it to
use.

Based on reading DCC's driver, seems to me it's more like a bus tracing
module rather than an uncore PMU. I found the driver does not support
interrupt, I am not sure this is a hardware limitation or just software
doesn't implement the interrupt handling, without interrupt, it would be
difficult for using DCC for profiling.

If we register DCC into perf framework, the good thing is DCC can use
perf framework (e.g. perf's configs) as its user space interface, but
it's still not clear for me how to capture the DCC trace data (no
interrupt and not relevant with process task switching).

[...]

> If the possible use is purely for saving some state across
> a reboot, as opposed to other events, I wonder if there is
> a good way to integrate it into the fs/pstore/ code, which
> already has a way to multiplex various kinds of input (log
> buffer, ftrace call chain, userspace strings, ...) into
> various kinds of persistent buffers (sram, blockdev, mtd,
> efivars, ...) with the purpose of helping analyze the
> state after a reboot.

Good point!

I understand pstore/ramoops is somehow like a sink which routes the
tracing data (software tracing data but not hardware tracing data) to
persistent memory. This is why we also can route these software
tracing data to STM (hardware sink!).

Seems to me, Arnd suggests to connect two sinks between DCC and
pstore (to persistent memory). But I cannot give an example code in
kernel for doing this way, sorry if I miss something.

Essentially, a good use case is to keep a persistent memory for the
tracing data, then after rebooting cycle we can retrieve the tracing
data via user space interface (like sysfs node).

Thanks,
Leo

On 2/22/2023 7:13 AM, Leo Yan wrote:
> On Wed, Feb 15, 2023 at 04:05:36PM +0100, Arnd Bergmann wrote:
>
> [...]
>
>>> To my knowledge the hwtracing framework is an interface for
>>> enabling/disabling traces and then you get a stream of trace data out of
>>> it.
>>>
>>> With DCC you essentially write a small "program" to be run at the time
>>> of an exception (or triggered manually). When the "program" is run it
>>> acquires data from mmio interfaces and stores data in sram, which can
>>> then be retrieved - possibly after the fatal reset of the system.
>>>
>>> Perhaps I've misunderstood the hwtracing framework, please help me steer
>>> Souradeep towards a subsystem you find suitable for this functionality.
>>
>> I'm also not too familiar with tracing infrastructure and was hoping
>> that the coresight maintainers (Mathieu, Suzuki, Mike and Leo)
>> would have some suggestions here. My initial guess was that in
>> both cases, you have hardware support that is abstracted by the
>> kernel in order to have a user interface that can be consumed
>> by the 'perf' tool.
>
> My understanding is hwtracing provides a common framework for STM so
> that different tracing IPs (like Intel_th and Arm CoreSight) can
> register STM module into this framework. The framework code is placed
> in: linux/drivers/hwtracing/stm.
>
> Now kernel doesn't provide a general framework for all hardware tracing
> IPs, e.g. Arm CoreSight has its own framework to manage tracing
> components and creating links with sinks.
>
> Simply to say, we can place DCC driver in linux/drivers/hwtracing folder
> (like Hisilicon's ptt driver), but we have no common framework for it to
> use.
>
> Based on reading DCC's driver, seems to me it's more like a bus tracing
> module rather than an uncore PMU. I found the driver does not support
> interrupt, I am not sure this is a hardware limitation or just software
> doesn't implement the interrupt handling, without interrupt, it would be
> difficult for using DCC for profiling.
>
> If we register DCC into perf framework, the good thing is DCC can use
> perf framework (e.g. perf's configs) as its user space interface, but
> it's still not clear for me how to capture the DCC trace data (no
> interrupt and not relevant with process task switching).
>
> [...]
>
>> If the possible use is purely for saving some state across
>> a reboot, as opposed to other events, I wonder if there is
>> a good way to integrate it into the fs/pstore/ code, which
>> already has a way to multiplex various kinds of input (log
>> buffer, ftrace call chain, userspace strings, ...) into
>> various kinds of persistent buffers (sram, blockdev, mtd,
>> efivars, ...) with the purpose of helping analyze the
>> state after a reboot.
>
> Good point!
>
> I understand pstore/ramoops is somehow like a sink which routes the
> tracing data (software tracing data but not hardware tracing data) to
> persistent memory. This is why we also can route these software
> tracing data to STM (hardware sink!).
>
> Seems to me, Arnd suggests to connect two sinks between DCC and
> pstore (to persistent memory). But I cannot give an example code in
> kernel for doing this way, sorry if I miss something.
>
> Essentially, a good use case is to keep a persistent memory for the
> tracing data, then after rebooting cycle we can retrieve the tracing
> data via user space interface (like sysfs node).

Hi Leo/Arnd,

Just wanted to let you know that the justification of not using PStore was
already given in the version 1 of this patch series as below

https://lore.kernel.org/linux-arm-msm/ab30490c016f906fd9bc5d789198530b@codeaurora.org/#r

PStore/Ramoops only persists across warm-reboots which is present for chrome
devices but not for android ones. Also the dcc_sram contents can
also be collected by going for a software trigger after loading the kernel
and the dcc_sram is parsed to get the register values with the
opensource parser as below

https://source.codeaurora.org/quic/la/platform/vendor/qcom-opensource/tools/tree/dcc_parser

Pstore on the other hand can only be collected on the next reboot.

Thanks,
Souradeep

>
> Thanks,
> Leo
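
To make the "capture, then parse offline" flow concrete, the sketch below decodes a captured buffer into register values. The record layout is an assumption made purely for illustration (little-endian 32-bit address followed by a 32-bit value); the real dcc_sram format is different and is what the dcc_parser tool linked above handles. `RECORD` and `parse_capture` are hypothetical names.

```python
import struct
from typing import Dict

# Hypothetical record layout: little-endian 32-bit address, 32-bit value.
# This is NOT the real dcc_sram format.
RECORD = struct.Struct("<II")


def parse_capture(buf: bytes) -> Dict[int, int]:
    """Decode a buffer of (address, value) records into a dict."""
    if len(buf) % RECORD.size:
        raise ValueError("buffer length is not a whole number of records")
    return {addr: value for addr, value in RECORD.iter_unpack(buf)}


# Build a small sample buffer in place of data read back from the device.
sample = RECORD.pack(0x1000, 0xDEADBEEF) + RECORD.pack(0x2000, 0x12345678)
registers = parse_capture(sample)

for addr, value in sorted(registers.items()):
    print(f"{addr:#010x}: {value:#010x}")
```

Because the buffer can be read back after a software trigger on a running system, this kind of decoding does not have to wait for the next boot, which is the contrast with pstore that Souradeep draws.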

Hi Souradeep,

On Wed, Feb 22, 2023 at 04:46:07PM +0530, Souradeep Chowdhury wrote:
> On 2/22/2023 7:13 AM, Leo Yan wrote:
> > On Wed, Feb 15, 2023 at 04:05:36PM +0100, Arnd Bergmann wrote:

[...]

> > > If the possible use is purely for saving some state across
> > > a reboot, as opposed to other events, I wonder if there is
> > > a good way to integrate it into the fs/pstore/ code, which
> > > already has a way to multiplex various kinds of input (log
> > > buffer, ftrace call chain, userspace strings, ...) into
> > > various kinds of persistent buffers (sram, blockdev, mtd,
> > > efivars, ...) with the purpose of helping analyze the
> > > state after a reboot.
> >
> > Good point!
> >
> > I understand pstore/ramoops is somehow like a sink which routes the
> > tracing data (software tracing data but not hardware tracing data) to
> > persistent memory. This is why we also can route these software
> > tracing data to STM (hardware sink!).
> >
> > Seems to me, Arnd suggests to connect two sinks between DCC and
> > pstore (to persistent memory). But I cannot give an example code in
> > kernel for doing this way, sorry if I miss something.
> >
> > Essentially, a good use case is to keep a persistent memory for the
> > tracing data, then after rebooting cycle we can retrieve the tracing
> > data via user space interface (like sysfs node).
>
> Hi Leo/Arnd,
>
> Just wanted to let you know that the justification of not using PStore was
> already given in the version 1 of this patch series as below
>
> https://lore.kernel.org/linux-arm-msm/ab30490c016f906fd9bc5d789198530b@codeaurora.org/#r
>
> PStore/Ramoops only persists across warm-reboots which is present for chrome
> devices but not for android ones.

Thanks for the info. Just remind a subtle difference of reboots.

Besides warm reboot, kernel can reboot system after panic (see kernel
command line option `panic`) and watchdog can reboot the system as well.

Even though Android doesn't support warm reboot, system still can reboot
on panic or by watchdog (in particular after bus lockup), pstore/ramoops
also can support these cases.

> Also the dcc_sram contents can
> also be collected by going for a software trigger after loading the kernel
> and the dcc_sram is parsed to get the register values with the
> opensource parser as below
>
> https://source.codeaurora.org/quic/la/platform/vendor/qcom-opensource/tools/tree/dcc_parser

To be clear, current driver is fine for me (TBH, I didn't spend much
time to read it but it's very neat after quickly went through it), I
just share some info in case it's helpful for the discussion.

Thanks,
Leo

On 2/22/2023 5:43 PM, Leo Yan wrote:
> Hi Souradeep,
>
> On Wed, Feb 22, 2023 at 04:46:07PM +0530, Souradeep Chowdhury wrote:
>> On 2/22/2023 7:13 AM, Leo Yan wrote:
>>> On Wed, Feb 15, 2023 at 04:05:36PM +0100, Arnd Bergmann wrote:
>
> [...]
>
>>>> If the possible use is purely for saving some state across
>>>> a reboot, as opposed to other events, I wonder if there is
>>>> a good way to integrate it into the fs/pstore/ code, which
>>>> already has a way to multiplex various kinds of input (log
>>>> buffer, ftrace call chain, userspace strings, ...) into
>>>> various kinds of persistent buffers (sram, blockdev, mtd,
>>>> efivars, ...) with the purpose of helping analyze the
>>>> state after a reboot.
>>>
>>> Good point!
>>>
>>> I understand pstore/ramoops is somehow like a sink which routes the
>>> tracing data (software tracing data but not hardware tracing data) to
>>> persistent memory. This is why we also can route these software
>>> tracing data to STM (hardware sink!).
>>>
>>> Seems to me, Arnd suggests to connect two sinks between DCC and
>>> pstore (to persistent memory). But I cannot give an example code in
>>> kernel for doing this way, sorry if I miss something.
>>>
>>> Essentially, a good use case is to keep a persistent memory for the
>>> tracing data, then after rebooting cycle we can retrieve the tracing
>>> data via user space interface (like sysfs node).
>>
>> Hi Leo/Arnd,
>>
>> Just wanted to let you know that the justification of not using PStore was
>> already given in the version 1 of this patch series as below
>>
>> https://lore.kernel.org/linux-arm-msm/ab30490c016f906fd9bc5d789198530b@codeaurora.org/#r
>>
>> PStore/Ramoops only persists across warm-reboots which is present for chrome
>> devices but not for android ones.
>
> Thanks for the info. Just remind a subtle difference of reboots.
>
> Besides warm reboot, kernel can reboot system after panic (see kernel
> command line option `panic`) and watchdog can reboot the system as well.
>
> Even though Android doesn't support warm reboot, system still can reboot
> on panic or by watchdog (in particular after bus lockup), pstore/ramoops
> also can support these cases.

So for the SoCs that don't support warm reboots, the DDR memory is
non-persistent across panics or watchdog bites in which case the
PStore/Ramoops cannot be of use.

>
>> Also the dcc_sram contents can
>> also be collected by going for a software trigger after loading the kernel
>> and the dcc_sram is parsed to get the register values with the
>> opensource parser as below
>>
>> https://source.codeaurora.org/quic/la/platform/vendor/qcom-opensource/tools/tree/dcc_parser
>
> To be clear, current driver is fine for me (TBH, I didn't spend much
> time to read it but it's very neat after quickly went through it), I
> just share some info in case it's helpful for the discussion.
>
> Thanks,
> Leo
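
The persistence argument in this exchange can be summarised with a small model. It assumes, as stated in the thread, that a DDR-backed region (such as one used by ramoops) survives only a warm reboot, while the DCC SRAM contents can still be retrieved after a fatal reset. `Reset` and `ToyBoard` are hypothetical names, and the model says nothing about how any particular SoC actually behaves.

```python
from enum import Enum
from typing import List


class Reset(Enum):
    WARM = "warm"  # DRAM contents survive the reset
    COLD = "cold"  # DRAM contents are lost


class ToyBoard:
    """Hypothetical board with a DRAM-backed log and a small SRAM capture."""

    def __init__(self) -> None:
        self.dram_log: List[str] = []      # stands in for a ramoops region in DDR
        self.sram_capture: List[str] = []  # stands in for the DCC SRAM

    def crash(self, reset: Reset) -> None:
        """Record state at crash time, then apply the reset's effect on DRAM."""
        self.dram_log.append("panic log")
        self.sram_capture.append("register snapshot")
        if reset is Reset.COLD:
            self.dram_log.clear()  # DDR is not retained across this reset


warm_board = ToyBoard()
warm_board.crash(Reset.WARM)

cold_board = ToyBoard()
cold_board.crash(Reset.COLD)

print("warm:", warm_board.dram_log, warm_board.sram_capture)
print("cold:", cold_board.dram_log, cold_board.sram_capture)
```

In this model the two mechanisms are equivalent after a warm reset and differ only in the cold case, which is the case Souradeep says applies to the SoCs in question.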

On 2/27/2023 4:43 AM, Souradeep Chowdhury wrote:
>
>
> On 2/22/2023 5:43 PM, Leo Yan wrote:
>> Hi Souradeep,
>>
>> On Wed, Feb 22, 2023 at 04:46:07PM +0530, Souradeep Chowdhury wrote:
>>> On 2/22/2023 7:13 AM, Leo Yan wrote:
>>>> On Wed, Feb 15, 2023 at 04:05:36PM +0100, Arnd Bergmann wrote:
>>
>> [...]
>>
>>>>> If the possible use is purely for saving some state across
>>>>> a reboot, as opposed to other events, I wonder if there is
>>>>> a good way to integrate it into the fs/pstore/ code, which
>>>>> already has a way to multiplex various kinds of input (log
>>>>> buffer, ftrace call chain, userspace strings, ...) into
>>>>> various kinds of persistent buffers (sram, blockdev, mtd,
>>>>> efivars, ...) with the purpose of helping analyze the
>>>>> state after a reboot.
>>>>
>>>> Good point!
>>>>
>>>> I understand pstore/ramoops is somehow like a sink which routes the
>>>> tracing data (software tracing data but not hardware tracing data) to
>>>> persistent memory. This is why we also can route these software
>>>> tracing data to STM (hardware sink!).
>>>>
>>>> Seems to me, Arnd suggests to connect two sinks between DCC and
>>>> pstore (to persistent memory). But I cannot give an example code in
>>>> kernel for doing this way, sorry if I miss something.
>>>>
>>>> Essentially, a good use case is to keep a persistent memory for the
>>>> tracing data, then after rebooting cycle we can retrieve the tracing
>>>> data via user space interface (like sysfs node).
>>>
>>> Hi Leo/Arnd,
>>>
>>> Just wanted to let you know that the justification of not using
>>> PStore was
>>> already given in the version 1 of this patch series as below
>>>
>>> https://lore.kernel.org/linux-arm-msm/ab30490c016f906fd9bc5d789198530b@codeaurora.org/#r
>>>
>>> PStore/Ramoops only persists across warm-reboots which is present for
>>> chrome
>>> devices but not for android ones.
>>
>> Thanks for the info. Just remind a subtle difference of reboots.
>>
>> Besides warm reboot, kernel can reboot system after panic (see kernel
>> command line option `panic`) and watchdog can reboot the system as well.
>>
>> Even though Android doesn't support warm reboot, system still can reboot
>> on panic or by watchdog (in particular after bus lockup), pstore/ramoops
>> also can support these cases.
>
>
> So for the SoCs that don't support warm reboots, the DDR memory is
> non-persistent across panics or watchdog bites in which case the
> PStore/Ramoops cannot be of use.
>
>
>>
>>> Also the dcc_sram contents can
>>> also be collected by going for a software trigger after loading the
>>> kernel
>>> and the dcc_sram is parsed to get the register values with the
>>> opensource parser as below
>>>
>>> https://source.codeaurora.org/quic/la/platform/vendor/qcom-opensource/tools/tree/dcc_parser
>>
>> To be clear, current driver is fine for me (TBH, I didn't spend much
>> time to read it but it's very neat after quickly went through it), I
>> just share some info in case it's helpful for the discussion.

What is the conclusion here? Can we pick up the DCC now if we rebase to
the latest tree? It seems the response so far is that the driver is fine
as-is and it can be included without any changes. I want this driver
discussion to be concluded since we have been trying to submit it for
more than 23 months (as Bjorn counted :) ).

---Trilok Soni