Message ID | 20201203212640.663931-1-luzmaximilian@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Add support for Microsoft Surface System Aggregator Module | expand |
On Thu, Dec 03, 2020 at 10:26:31PM +0100, Maximilian Luz wrote: > Hello, > > Here is version two of the Surface System Aggregator Module (SAM/SSAM) > driver series, adding initial support for the embedded controller on 5th > and later generation Microsoft Surface devices. Initial support includes > the ACPI interface to the controller, via which battery and thermal > information is provided on some of these devices. > > The previous version and cover letter detailing what this series is > about can be found at > > https://lore.kernel.org/platform-driver-x86/20201115192143.21571-1-luzmaximilian@gmail.com/ > > This patch-set can also be found at the following repository and > reference, if you prefer to look at a kernel tree instead of these > emails: > > https://github.com/linux-surface/kernel tags/s/surface-aggregator/v2 > > Thank you all for the feedback to v1, I hope I have addressed all > comments. I think that it is too far fetched to attempt and expose UAPI headers for some obscure char device that we are all know won't be around in a couple of years from now due to the nature of how this embedded world works. More on that, the whole purpose of proposed interface is to debug and not intended to be used by any user space code. Also the idea that you are creating new bus just for this device doesn't really sound right. I recommend you to take a look on auxiliary bus and use it or come with very strong justifications why it is not fit yet. I'm sorry to say, but this series is not ready to be merged yet. NAK: Leon Romanovsky <leon@kernel.org> Thanks
On Sun, Dec 06, 2020 at 09:07:05AM +0200, Leon Romanovsky wrote: > On Thu, Dec 03, 2020 at 10:26:31PM +0100, Maximilian Luz wrote: > > Hello, > > > > Here is version two of the Surface System Aggregator Module (SAM/SSAM) > > driver series, adding initial support for the embedded controller on 5th > > and later generation Microsoft Surface devices. Initial support includes > > the ACPI interface to the controller, via which battery and thermal > > information is provided on some of these devices. > > > > The previous version and cover letter detailing what this series is > > about can be found at > > > > https://lore.kernel.org/platform-driver-x86/20201115192143.21571-1-luzmaximilian@gmail.com/ > > > > This patch-set can also be found at the following repository and > > reference, if you prefer to look at a kernel tree instead of these > > emails: > > > > https://github.com/linux-surface/kernel tags/s/surface-aggregator/v2 > > > > Thank you all for the feedback to v1, I hope I have addressed all > > comments. > > > I think that it is too far fetched to attempt and expose UAPI headers > for some obscure char device that we are all know won't be around in > a couple of years from now due to the nature of how this embedded world > works. No, that's not ok, we do this for loads of devices out there. If there is a device that wants to be supported for Linux, and a developer that wants to support it, we will take it. > More on that, the whole purpose of proposed interface is to debug and > not intended to be used by any user space code. I thought that debugfs was going to be used for most of the debugging code, or has that changed in newer versions of this patchset? thanks, greg k-h
On Sun, Dec 06, 2020 at 09:32:30AM +0100, Greg Kroah-Hartman wrote: > On Sun, Dec 06, 2020 at 09:07:05AM +0200, Leon Romanovsky wrote: > > On Thu, Dec 03, 2020 at 10:26:31PM +0100, Maximilian Luz wrote: > > > Hello, > > > > > > Here is version two of the Surface System Aggregator Module (SAM/SSAM) > > > driver series, adding initial support for the embedded controller on 5th > > > and later generation Microsoft Surface devices. Initial support includes > > > the ACPI interface to the controller, via which battery and thermal > > > information is provided on some of these devices. > > > > > > The previous version and cover letter detailing what this series is > > > about can be found at > > > > > > https://lore.kernel.org/platform-driver-x86/20201115192143.21571-1-luzmaximilian@gmail.com/ > > > > > > This patch-set can also be found at the following repository and > > > reference, if you prefer to look at a kernel tree instead of these > > > emails: > > > > > > https://github.com/linux-surface/kernel tags/s/surface-aggregator/v2 > > > > > > Thank you all for the feedback to v1, I hope I have addressed all > > > comments. > > > > > > I think that it is too far fetched to attempt and expose UAPI headers > > for some obscure char device that we are all know won't be around in > > a couple of years from now due to the nature of how this embedded world > > works. > > No, that's not ok, we do this for loads of devices out there. If there > is a device that wants to be supported for Linux, and a developer that > wants to support it, we will take it. > > > More on that, the whole purpose of proposed interface is to debug and > > not intended to be used by any user space code. > > I thought that debugfs was going to be used for most of the debugging > code, or has that changed in newer versions of this patchset? https://lore.kernel.org/lkml/20201203212640.663931-9-luzmaximilian@gmail.com/#Z30include:uapi:linux:surface_aggregator:cdev.h + * Definitions, structs, and IOCTLs for the /dev/surface/aggregator misc + * device. This device provides direct user-space access to the SSAM EC. + * Intended for debugging and development. > > thanks, > > greg k-h
Hi Leon, On 12/6/20 8:07 AM, Leon Romanovsky wrote: > On Thu, Dec 03, 2020 at 10:26:31PM +0100, Maximilian Luz wrote: >> Hello, >> >> Here is version two of the Surface System Aggregator Module (SAM/SSAM) >> driver series, adding initial support for the embedded controller on 5th >> and later generation Microsoft Surface devices. Initial support includes >> the ACPI interface to the controller, via which battery and thermal >> information is provided on some of these devices. >> >> The previous version and cover letter detailing what this series is >> about can be found at >> >> https://lore.kernel.org/platform-driver-x86/20201115192143.21571-1-luzmaximilian@gmail.com/ >> >> This patch-set can also be found at the following repository and >> reference, if you prefer to look at a kernel tree instead of these >> emails: >> >> https://github.com/linux-surface/kernel tags/s/surface-aggregator/v2 >> >> Thank you all for the feedback to v1, I hope I have addressed all >> comments. > > > I think that it is too far fetched to attempt and expose UAPI headers > for some obscure char device that we are all know won't be around in > a couple of years from now due to the nature of how this embedded world > works. This is not for an embedded device, but for the popular line of Microsoft Surface laptops / 2-in-1s... > More on that, the whole purpose of proposed interface is to debug and > not intended to be used by any user space code. The purpose is to provide raw access to the Surface Serial Hub protocol, just like we provide raw access to USB devices and have hidraw devices. So this goes a litle beyond just debugging; and eventually the choice may be made to implement some functionality with userspace drivers, just like we do for some HID and USB devices. Still I agree with you that adding new userspace API is something which needs to be considered carefully. So I will look at this closely when reviewing this set. > Also the idea that you are creating new bus just for this device doesn't > really sound right. I recommend you to take a look on auxiliary bus and > use it or come with very strong justifications why it is not fit yet. AFAIK the auxiliary bus is for sharing a single device between multiple drivers, while the main device-driver also still offers functionality (beyond the providing of access) itself. This is more akin to how the WMI driver also models different WMI functions as a bus + devices on the bus. Or how the SDIO driver multiplex a single SDIO device into its functions by again using a bus + devices on the bus model. Also this has been in the works for quite a while now, the Linux on Microsoft Surface devices community has been working on this out of tree for a long time, see: https://github.com/linux-surface/ And an RFC and a v1 have been posted a while ago, while auxiliary bus support is not even in the mainline kernel yet. I would agree with you that this should switch to auxiliary bus, despite the timing, if that would lead to much better code. But ATM I don't really see switching to auxiliary bus offering much benefits here. > I'm sorry to say, but this series is not ready to be merged yet. > > NAK: Leon Romanovsky <leon@kernel.org> See above, I believe that this all is a bit harsh and I have not really heard convincing arguments for not merging this. Moreover such a quick nack does not really promote working upstream, where as we actually want people to work upstream as much as possible. I know this is not a reason for taking bad code, but I'm not convinced that this is bad code. I have not reviewed this myself yet, but once I have reviewed this and any review remarks have been addressed I do expect to merge this series through the platform-drivers-x86 tree. Regards, Hans de Goede (drivers/platform/x86 and drivers/platform/surface subsys maintainer)
On Sun, Dec 06, 2020 at 09:41:21AM +0100, Hans de Goede wrote: > Hi Leon, > > On 12/6/20 8:07 AM, Leon Romanovsky wrote: > > On Thu, Dec 03, 2020 at 10:26:31PM +0100, Maximilian Luz wrote: > >> Hello, > >> > >> Here is version two of the Surface System Aggregator Module (SAM/SSAM) > >> driver series, adding initial support for the embedded controller on 5th > >> and later generation Microsoft Surface devices. Initial support includes > >> the ACPI interface to the controller, via which battery and thermal > >> information is provided on some of these devices. > >> > >> The previous version and cover letter detailing what this series is > >> about can be found at > >> > >> https://lore.kernel.org/platform-driver-x86/20201115192143.21571-1-luzmaximilian@gmail.com/ > >> > >> This patch-set can also be found at the following repository and > >> reference, if you prefer to look at a kernel tree instead of these > >> emails: > >> > >> https://github.com/linux-surface/kernel tags/s/surface-aggregator/v2 > >> > >> Thank you all for the feedback to v1, I hope I have addressed all > >> comments. > > > > > > I think that it is too far fetched to attempt and expose UAPI headers > > for some obscure char device that we are all know won't be around in > > a couple of years from now due to the nature of how this embedded world > > works. > > This is not for an embedded device, but for the popular line of > Microsoft Surface laptops / 2-in-1s... It is the naming, we don't have char device for every "laptop" vendor. Why is Microsoft different here? > > > More on that, the whole purpose of proposed interface is to debug and > > not intended to be used by any user space code. > > The purpose is to provide raw access to the Surface Serial Hub protocol, > just like we provide raw access to USB devices and have hidraw devices. USB devices implement standard protocol, this surface hub is nothing even close to that. > > So this goes a litle beyond just debugging; and eventually the choice > may be made to implement some functionality with userspace drivers, > just like we do for some HID and USB devices. I don't know how it goes in device/platform area, but in other large subsystems, UAPI should be presented with working user-space part. > > Still I agree with you that adding new userspace API is something which > needs to be considered carefully. So I will look at this closely when > reviewing this set. > > > Also the idea that you are creating new bus just for this device doesn't > > really sound right. I recommend you to take a look on auxiliary bus and > > use it or come with very strong justifications why it is not fit yet. > > AFAIK the auxiliary bus is for sharing a single device between multiple > drivers, while the main device-driver also still offers functionality > (beyond the providing of access) itself. The idea behind auxiliary bus is to slice various functionalities into different sub-drivers, see it as a way to create subsystem inside one driver. > > This is more akin to how the WMI driver also models different WMI > functions as a bus + devices on the bus. > > Or how the SDIO driver multiplex a single SDIO device into its > functions by again using a bus + devices on the bus model. > > Also this has been in the works for quite a while now, the Linux on > Microsoft Surface devices community has been working on this out of > tree for a long time, see: > https://github.com/linux-surface/ It is not relevant, the code is merged than it is ready. > > And an RFC and a v1 have been posted a while ago, while auxiliary > bus support is not even in the mainline kernel yet. I would agree > with you that this should switch to auxiliary bus, despite the timing, > if that would lead to much better code. But ATM I don't really see > switching to auxiliary bus offering much benefits here. The auxiliary bus is merged and part of linux-next. https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git/tag/?h=auxbus-5.11-rc1 > > > I'm sorry to say, but this series is not ready to be merged yet. > > > > NAK: Leon Romanovsky <leon@kernel.org> > > See above, I believe that this all is a bit harsh and I have not > really heard convincing arguments for not merging this. > > Moreover such a quick nack does not really promote working upstream, > where as we actually want people to work upstream as much as possible. > I know this is not a reason for taking bad code, but I'm not > convinced that this is bad code. I naked explicitly for two reasons: UAPI(chardev) and lack of rationale for the custom bus, never said "bad code". > > I have not reviewed this myself yet, but once I have reviewed > this and any review remarks have been addressed I do expect to > merge this series through the platform-drivers-x86 tree. > > Regards, > > Hans de Goede > (drivers/platform/x86 and drivers/platform/surface subsys maintainer) >
> > > More on that, the whole purpose of proposed interface is to debug and > > not intended to be used by any user space code. > > The purpose is to provide raw access to the Surface Serial Hub protocol, > just like we provide raw access to USB devices and have hidraw devices. > > So this goes a litle beyond just debugging; and eventually the choice > may be made to implement some functionality with userspace drivers, > just like we do for some HID and USB devices. > > Still I agree with you that adding new userspace API is something which > needs to be considered carefully. So I will look at this closely when > reviewing this set. To add to that: this was previously a debugfs interface but was moved to misc after review on the initial RFC: https://lkml.org/lkml/2020/9/24/96
On Sun, Dec 06, 2020 at 05:58:32PM +0900, Blaž Hrastnik wrote: > > > > > More on that, the whole purpose of proposed interface is to debug and > > > not intended to be used by any user space code. > > > > The purpose is to provide raw access to the Surface Serial Hub protocol, > > just like we provide raw access to USB devices and have hidraw devices. > > > > So this goes a litle beyond just debugging; and eventually the choice > > may be made to implement some functionality with userspace drivers, > > just like we do for some HID and USB devices. > > > > Still I agree with you that adding new userspace API is something which > > needs to be considered carefully. So I will look at this closely when > > reviewing this set. > > To add to that: this was previously a debugfs interface but was moved to misc after review on the initial RFC: > https://lkml.org/lkml/2020/9/24/96 There is a huge difference between the suggestion and final implementation. Greg suggested to add new debug module to the drivers/misc that will open char device explicitly after user loaded that module to debug this hub. However, the author added full blown char device as a first citizen that has all not-break-user constrains. Thanks
Hi, On 12/6/20 9:56 AM, Leon Romanovsky wrote: > On Sun, Dec 06, 2020 at 09:41:21AM +0100, Hans de Goede wrote: >> Hi Leon, >> >> On 12/6/20 8:07 AM, Leon Romanovsky wrote: >>> On Thu, Dec 03, 2020 at 10:26:31PM +0100, Maximilian Luz wrote: >>>> Hello, >>>> >>>> Here is version two of the Surface System Aggregator Module (SAM/SSAM) >>>> driver series, adding initial support for the embedded controller on 5th >>>> and later generation Microsoft Surface devices. Initial support includes >>>> the ACPI interface to the controller, via which battery and thermal >>>> information is provided on some of these devices. >>>> >>>> The previous version and cover letter detailing what this series is >>>> about can be found at >>>> >>>> https://lore.kernel.org/platform-driver-x86/20201115192143.21571-1-luzmaximilian@gmail.com/ >>>> >>>> This patch-set can also be found at the following repository and >>>> reference, if you prefer to look at a kernel tree instead of these >>>> emails: >>>> >>>> https://github.com/linux-surface/kernel tags/s/surface-aggregator/v2 >>>> >>>> Thank you all for the feedback to v1, I hope I have addressed all >>>> comments. >>> >>> >>> I think that it is too far fetched to attempt and expose UAPI headers >>> for some obscure char device that we are all know won't be around in >>> a couple of years from now due to the nature of how this embedded world >>> works. >> >> This is not for an embedded device, but for the popular line of >> Microsoft Surface laptops / 2-in-1s... > > It is the naming, we don't have char device for every "laptop" vendor. > Why is Microsoft different here? Because their hardware department has invented a whole new way of dealing with a bunch of things at the hardware level (for some reason). Also almost all laptop vendors have a whole bunch of laptop vendor specific userspace API in the form of sysfs files exported by drivers/platform/x86/laptop-vendor.c drivers. E.g. do: ls /sys/bus/platform/devices/thinkpad_acpi/ An any IBM/Lenovo Thinkpad (and only on a Thinkpad) to see a bunch of laptop vendor specific UAPI. Since I've become the pdx86 subsys maintainer I've actually been pushing back against adding more of this, instead trying to either use existing UAPIs, or defining new common UAPIs which can be shared between vendors. So I agree very much with you that we need to be careful about needlessly introducing new UAPI. But there is a difference between being careful and just nacking it because no new UAPI may be added at all (also see GKH's response). >>> More on that, the whole purpose of proposed interface is to debug and >>> not intended to be used by any user space code. >> >> The purpose is to provide raw access to the Surface Serial Hub protocol, >> just like we provide raw access to USB devices and have hidraw devices. > > USB devices implement standard protocol, this surface hub is nothing > even close to that. The USB protocol just defines a transport layer, outside of the USB classes there are plenty of proprietary protocols on top of that transport. And this chardev just offers access to the Surface Serial Hub transport protocol. And if you want something even closer the i2cdev module offers raw I2C transfer access and I2C defines no protocol other then how to read or write a number of bytes. I do a lot of hw enablement work and being able to poke HID / USB / I2C devices directly from userspace is very useful for this. >> So this goes a litle beyond just debugging; and eventually the choice >> may be made to implement some functionality with userspace drivers, >> just like we do for some HID and USB devices. > > I don't know how it goes in device/platform area, but in other large > subsystems, UAPI should be presented with working user-space part. > >> >> Still I agree with you that adding new userspace API is something which >> needs to be considered carefully. So I will look at this closely when >> reviewing this set. So this ^^^ still stands, I agree with you that adding new UAPI needs to be considered carefully and when I get around to reviewing this that is exactly what I will do. Maximilian, can you perhaps explain a bit more of what you want / expect to use the chardev for, and maybe provide pointers to the matching userspace utilities (which I presume you have) ? >>> Also the idea that you are creating new bus just for this device doesn't >>> really sound right. I recommend you to take a look on auxiliary bus and >>> use it or come with very strong justifications why it is not fit yet. >> >> AFAIK the auxiliary bus is for sharing a single device between multiple >> drivers, while the main device-driver also still offers functionality >> (beyond the providing of access) itself. > > The idea behind auxiliary bus is to slice various functionalities into > different sub-drivers, see it as a way to create subsystem inside one > driver. AFAIK the idea is to be able to combine multiple physical devices, e.g. a PCI device + an ACPI enumerated platform device and then slice the combination of those 2 up in new devices which may use parts of both parent devices, quoting from: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git/tree/Documentation/driver-api/auxiliary_bus.rst?h=driver-core-next&id=7de3697e9cbd4bd3d62bafa249d57990e1b8f294 "multiple devices might implement a common intersection of functionality" IOW this is for cases where the simpler bus + devices model does not work well. AFAICT in this case the simpler bus + devices does work well, so there is no need to use the auxiliary bus. >> This is more akin to how the WMI driver also models different WMI >> functions as a bus + devices on the bus. >> >> Or how the SDIO driver multiplex a single SDIO device into its >> functions by again using a bus + devices on the bus model. >> >> Also this has been in the works for quite a while now, the Linux on >> Microsoft Surface devices community has been working on this out of >> tree for a long time, see: >> https://github.com/linux-surface/ > > It is not relevant, the code is merged than it is ready. It is relevant, you cannot expect drivers which were written during the last 6 months to use functionality which is not even in the mainline kernel yet (yes it is in -next, but not in mainline). Now if that new functionality where to provide major benefits to the code making it much cleaner / better then yes asking to rewrite it to use that new functionality would make sense. I need to take a closer look at the code but ATM I'm not convinced that rewriting it to use the new auxiliary bus stuff will make it better at all, let alone enough to warrant the effort of the rewrite. <snip> Regards, Hans p.s. For the record I do not own any Microsoft Surface devices myself, so I have no interests here other then to make sure that the Linux kernel is welcoming to new new contributors and does not needlessly scare them away.
On Sun, Dec 06, 2020 at 11:04:06AM +0100, Hans de Goede wrote: > Hi, > > On 12/6/20 9:56 AM, Leon Romanovsky wrote: > > On Sun, Dec 06, 2020 at 09:41:21AM +0100, Hans de Goede wrote: > >> Hi Leon, > >> > >> On 12/6/20 8:07 AM, Leon Romanovsky wrote: > >>> On Thu, Dec 03, 2020 at 10:26:31PM +0100, Maximilian Luz wrote: > >>>> Hello, > >>>> > >>>> Here is version two of the Surface System Aggregator Module (SAM/SSAM) > >>>> driver series, adding initial support for the embedded controller on 5th > >>>> and later generation Microsoft Surface devices. Initial support includes > >>>> the ACPI interface to the controller, via which battery and thermal > >>>> information is provided on some of these devices. > >>>> > >>>> The previous version and cover letter detailing what this series is > >>>> about can be found at > >>>> > >>>> https://lore.kernel.org/platform-driver-x86/20201115192143.21571-1-luzmaximilian@gmail.com/ > >>>> > >>>> This patch-set can also be found at the following repository and > >>>> reference, if you prefer to look at a kernel tree instead of these > >>>> emails: > >>>> > >>>> https://github.com/linux-surface/kernel tags/s/surface-aggregator/v2 > >>>> > >>>> Thank you all for the feedback to v1, I hope I have addressed all > >>>> comments. > >>> > >>> > >>> I think that it is too far fetched to attempt and expose UAPI headers > >>> for some obscure char device that we are all know won't be around in > >>> a couple of years from now due to the nature of how this embedded world > >>> works. > >> > >> This is not for an embedded device, but for the popular line of > >> Microsoft Surface laptops / 2-in-1s... > > > > It is the naming, we don't have char device for every "laptop" vendor. > > Why is Microsoft different here? > > Because their hardware department has invented a whole new way of dealing > with a bunch of things at the hardware level (for some reason). They are not different from any other vendor, it is much cheaper and easier to do not follow standard implementations. > > Also almost all laptop vendors have a whole bunch of laptop vendor > specific userspace API in the form of sysfs files exported by > drivers/platform/x86/laptop-vendor.c drivers. E.g. do: > > ls /sys/bus/platform/devices/thinkpad_acpi/ It is different from the proposed /dev/surface... char device. > > An any IBM/Lenovo Thinkpad (and only on a Thinkpad) to see a bunch > of laptop vendor specific UAPI. Yes, it is gross, IBM did it in early days of Linux. Other vendors don't have anything like that. > > Since I've become the pdx86 subsys maintainer I've actually been > pushing back against adding more of this, instead trying to > either use existing UAPIs, or defining new common UAPIs which can > be shared between vendors. > > So I agree very much with you that we need to be careful about > needlessly introducing new UAPI. > > But there is a difference between being careful and just nacking > it because no new UAPI may be added at all (also see GKH's response). I saw, the author misunderstood the Greg's comments. > > >>> More on that, the whole purpose of proposed interface is to debug and > >>> not intended to be used by any user space code. > >> > >> The purpose is to provide raw access to the Surface Serial Hub protocol, > >> just like we provide raw access to USB devices and have hidraw devices. > > > > USB devices implement standard protocol, this surface hub is nothing > > even close to that. > > The USB protocol just defines a transport layer, outside of the USB classes > there are plenty of proprietary protocols on top of that transport. > > And this chardev just offers access to the Surface Serial Hub transport > protocol. And if you want something even closer the i2cdev module offers > raw I2C transfer access and I2C defines no protocol other then > how to read or write a number of bytes. > > I do a lot of hw enablement work and being able to poke HID / USB / I2C > devices directly from userspace is very useful for this. Greg wrote how to do it. > > >> So this goes a litle beyond just debugging; and eventually the choice > >> may be made to implement some functionality with userspace drivers, > >> just like we do for some HID and USB devices. > > > > I don't know how it goes in device/platform area, but in other large > > subsystems, UAPI should be presented with working user-space part. > > > >> > >> Still I agree with you that adding new userspace API is something which > >> needs to be considered carefully. So I will look at this closely when > >> reviewing this set. > > So this ^^^ still stands, I agree with you that adding new UAPI needs > to be considered carefully and when I get around to reviewing this > that is exactly what I will do. > > Maximilian, can you perhaps explain a bit more of what you want / expect > to use the chardev for, and maybe provide pointers to the matching > userspace utilities (which I presume you have) ? > > >>> Also the idea that you are creating new bus just for this device doesn't > >>> really sound right. I recommend you to take a look on auxiliary bus and > >>> use it or come with very strong justifications why it is not fit yet. > >> > >> AFAIK the auxiliary bus is for sharing a single device between multiple > >> drivers, while the main device-driver also still offers functionality > >> (beyond the providing of access) itself. > > > > The idea behind auxiliary bus is to slice various functionalities into > > different sub-drivers, see it as a way to create subsystem inside one > > driver. > > AFAIK the idea is to be able to combine multiple physical devices, e.g. > a PCI device + an ACPI enumerated platform device and then slice the > combination of those 2 up in new devices which may use parts of both > parent devices, quoting from: > > https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git/tree/Documentation/driver-api/auxiliary_bus.rst?h=driver-core-next&id=7de3697e9cbd4bd3d62bafa249d57990e1b8f294 > > "multiple devices might implement a common intersection of functionality" It is one way, another is to take one device and create many small devices out of it. https://lore.kernel.org/alsa-devel/20201026111849.1035786-1-leon@kernel.org/ > > IOW this is for cases where the simpler bus + devices model does not > work well. AFAICT in this case the simpler bus + devices does work > well, so there is no need to use the auxiliary bus. It is designed to replace invention of custom buses. > > >> This is more akin to how the WMI driver also models different WMI > >> functions as a bus + devices on the bus. > >> > >> Or how the SDIO driver multiplex a single SDIO device into its > >> functions by again using a bus + devices on the bus model. > >> > >> Also this has been in the works for quite a while now, the Linux on > >> Microsoft Surface devices community has been working on this out of > >> tree for a long time, see: > >> https://github.com/linux-surface/ > > > > It is not relevant, the code is merged than it is ready. > > It is relevant, you cannot expect drivers which were written during > the last 6 months to use functionality which is not even in the > mainline kernel yet (yes it is in -next, but not in mainline). > > Now if that new functionality where to provide major benefits to > the code making it much cleaner / better then yes asking to rewrite > it to use that new functionality would make sense. And who will rewrite it? My experience shows that right code should be written from the beginning otherwise the code has too many chances to be abandoned. Thanks
On 12/6/20 10:06 AM, Leon Romanovsky wrote:> On Sun, Dec 06, 2020 at 05:58:32PM +0900, Blaž Hrastnik wrote: >>> >>>> More on that, the whole purpose of proposed interface is to debug and >>>> not intended to be used by any user space code. >>> >>> The purpose is to provide raw access to the Surface Serial Hub protocol, >>> just like we provide raw access to USB devices and have hidraw devices. >>> >>> So this goes a litle beyond just debugging; and eventually the choice >>> may be made to implement some functionality with userspace drivers, >>> just like we do for some HID and USB devices. >>> >>> Still I agree with you that adding new userspace API is something which >>> needs to be considered carefully. So I will look at this closely when >>> reviewing this set. >> >> To add to that: this was previously a debugfs interface but was moved to misc after review on the initial RFC: >> https://lkml.org/lkml/2020/9/24/96 > > There is a huge difference between the suggestion and final implementation. > > Greg suggested to add new debug module to the drivers/misc that will > open char device explicitly after user loaded that module to debug this > hub. However, the author added full blown char device as a first citizen > that has all not-break-user constrains. This module still needs to be loaded explicitly. And (I might be wrong about this) the "not-break-user constraints" hold as soon as I register a misc device at all, no? So I don't see how this is a) any different than previously discussed with Greg and b) how the uapi header now introduces any not-break-user constraints that would not be there without it. This interface is intended as a stable interface. That's something that I committed to as soon as I decided to implement this via a misc-device. Sure, I can move the definitions in the uapi header to the module itself, but I don't see any benefit in that. If someone really wants to use this interface, they can just as well copy the definitions from the module source itself. So why not be upfront about it and make life easier for everyone? Regards, Max
Hi, On 12/6/20 11:33 AM, Leon Romanovsky wrote: > On Sun, Dec 06, 2020 at 11:04:06AM +0100, Hans de Goede wrote: <snip> >> But there is a difference between being careful and just nacking >> it because no new UAPI may be added at all (also see GKH's response). > > I saw, the author misunderstood the Greg's comments. Quoting from patch 8/9: " +============================== +User-Space EC Interface (cdev) +============================== + +The ``surface_aggregator_cdev`` module provides a misc-device for the SSAM +controller to allow for a (more or less) direct connection from user-space to +the SAM EC. It is intended to be used for development and debugging, and +therefore should not be used or relied upon in any other way. Note that this +module is not loaded automatically, but instead must be loaded manually. " If I'm not mistaken that seems to be pretty much what Greg asked for. Regards, Hans
Hi, On 12/6/20 11:33 AM, Maximilian Luz wrote: > On 12/6/20 10:06 AM, Leon Romanovsky wrote:> On Sun, Dec 06, 2020 at 05:58:32PM +0900, Blaž Hrastnik wrote: >>>> >>>>> More on that, the whole purpose of proposed interface is to debug and >>>>> not intended to be used by any user space code. >>>> >>>> The purpose is to provide raw access to the Surface Serial Hub protocol, >>>> just like we provide raw access to USB devices and have hidraw devices. >>>> >>>> So this goes a litle beyond just debugging; and eventually the choice >>>> may be made to implement some functionality with userspace drivers, >>>> just like we do for some HID and USB devices. >>>> >>>> Still I agree with you that adding new userspace API is something which >>>> needs to be considered carefully. So I will look at this closely when >>>> reviewing this set. >>> >>> To add to that: this was previously a debugfs interface but was moved to misc after review on the initial RFC: >>> https://lkml.org/lkml/2020/9/24/96 >> >> There is a huge difference between the suggestion and final implementation. >> >> Greg suggested to add new debug module to the drivers/misc that will >> open char device explicitly after user loaded that module to debug this >> hub. However, the author added full blown char device as a first citizen >> that has all not-break-user constrains. > > This module still needs to be loaded explicitly. Good then I really do not see a problem with this. > And (I might be wrong > about this) the "not-break-user constraints" hold as soon as I register > a misc device at all, no? Correct. > So I don't see how this is a) any different > than previously discussed with Greg and b) how the uapi header now > introduces any not-break-user constraints that would not be there > without it. > > This interface is intended as a stable interface. That's something that > I committed to as soon as I decided to implement this via a misc-device. > > Sure, I can move the definitions in the uapi header to the module > itself, but I don't see any benefit in that. Right, if we are going to use a misc chardev for this, then the correct thing to do is to put the API bits for that chardev under include/uapi. It would still be good if you can provide a pointer to some userspace tools using this new API; and for the next version maybe add that pointer to the commit message Regards, Hans
On 12/6/20 11:04 AM, Hans de Goede wrote: > Hi, > > On 12/6/20 9:56 AM, Leon Romanovsky wrote: >> On Sun, Dec 06, 2020 at 09:41:21AM +0100, Hans de Goede wrote: >>> Hi Leon, >>> >>> On 12/6/20 8:07 AM, Leon Romanovsky wrote: >>>> On Thu, Dec 03, 2020 at 10:26:31PM +0100, Maximilian Luz wrote: >>>>> Hello, >>>>> >>>>> Here is version two of the Surface System Aggregator Module (SAM/SSAM) >>>>> driver series, adding initial support for the embedded controller on 5th >>>>> and later generation Microsoft Surface devices. Initial support includes >>>>> the ACPI interface to the controller, via which battery and thermal >>>>> information is provided on some of these devices. >>>>> >>>>> The previous version and cover letter detailing what this series is >>>>> about can be found at >>>>> >>>>> https://lore.kernel.org/platform-driver-x86/20201115192143.21571-1-luzmaximilian@gmail.com/ >>>>> >>>>> This patch-set can also be found at the following repository and >>>>> reference, if you prefer to look at a kernel tree instead of these >>>>> emails: >>>>> >>>>> https://github.com/linux-surface/kernel tags/s/surface-aggregator/v2 >>>>> >>>>> Thank you all for the feedback to v1, I hope I have addressed all >>>>> comments. >>>> >>>> >>>> I think that it is too far fetched to attempt and expose UAPI headers >>>> for some obscure char device that we are all know won't be around in >>>> a couple of years from now due to the nature of how this embedded world >>>> works. >>> >>> This is not for an embedded device, but for the popular line of >>> Microsoft Surface laptops / 2-in-1s... >> >> It is the naming, we don't have char device for every "laptop" vendor. >> Why is Microsoft different here? > > Because their hardware department has invented a whole new way of dealing > with a bunch of things at the hardware level (for some reason). > > Also almost all laptop vendors have a whole bunch of laptop vendor > specific userspace API in the form of sysfs files exported by > drivers/platform/x86/laptop-vendor.c drivers. E.g. do: > > ls /sys/bus/platform/devices/thinkpad_acpi/ > > An any IBM/Lenovo Thinkpad (and only on a Thinkpad) to see a bunch > of laptop vendor specific UAPI. > > Since I've become the pdx86 subsys maintainer I've actually been > pushing back against adding more of this, instead trying to > either use existing UAPIs, or defining new common UAPIs which can > be shared between vendors. > > So I agree very much with you that we need to be careful about > needlessly introducing new UAPI. > > But there is a difference between being careful and just nacking > it because no new UAPI may be added at all (also see GKH's response). > >>>> More on that, the whole purpose of proposed interface is to debug and >>>> not intended to be used by any user space code. >>> >>> The purpose is to provide raw access to the Surface Serial Hub protocol, >>> just like we provide raw access to USB devices and have hidraw devices. >> >> USB devices implement standard protocol, this surface hub is nothing >> even close to that. > > The USB protocol just defines a transport layer, outside of the USB classes > there are plenty of proprietary protocols on top of that transport. > > And this chardev just offers access to the Surface Serial Hub transport > protocol. And if you want something even closer the i2cdev module offers > raw I2C transfer access and I2C defines no protocol other then > how to read or write a number of bytes. > > I do a lot of hw enablement work and being able to poke HID / USB / I2C > devices directly from userspace is very useful for this. This is pretty much the whole reason for this module. Surface devices vary from generation to generation, and I can't expect some random user to write some kernel module (or repeatedly pull 10 different versions of it)to test if some EC command does x or y. I can tell them to run a script with some arguments though. The main reason for this is development, debugging is secondary and IMHO part of development. So, IOW this interface provides direct access to the EC that would, without it, require you to write a kernel driver. The whole "this is intended for development and debugging" shtick is to deter people from using it to implement user-space based drivers. While this may be possible at some point, at the moment there is no good way to (reliably) detect which client devices are present from user-space. So any attempts at that will likely be unstable and should be implemented in the kernel anyway. It is a way of prototyping drivers though. >>> So this goes a litle beyond just debugging; and eventually the choice >>> may be made to implement some functionality with userspace drivers, >>> just like we do for some HID and USB devices. >> >> I don't know how it goes in device/platform area, but in other large >> subsystems, UAPI should be presented with working user-space part. >> >>> >>> Still I agree with you that adding new userspace API is something which >>> needs to be considered carefully. So I will look at this closely when >>> reviewing this set. > > So this ^^^ still stands, I agree with you that adding new UAPI needs > to be considered carefully and when I get around to reviewing this > that is exactly what I will do. > > Maximilian, can you perhaps explain a bit more of what you want / expect > to use the chardev for, and maybe provide pointers to the matching > userspace utilities (which I presume you have) ? Sure. There's a bunch of scripts at https://github.com/linux-surface/surface-aggregator-module/tree/master/scripts/ssam As described above, the main idea behind this simplifying development for devices that I can't test myself. See e.g. the "ctrl.py" script which can be used to send a basic command to the EC. The "hid.py" script is one that was successfully used to test commands for the Keyboard driver on the Surface Laptop 1 and 2. At some point my plan is to maybe split that out into its own repo and improve usability for all that, but I haven't gotten to that yet. [...] Regards, Max
On 12/6/20 11:43 AM, Hans de Goede wrote: > Hi, > > On 12/6/20 11:33 AM, Maximilian Luz wrote: >> On 12/6/20 10:06 AM, Leon Romanovsky wrote:> On Sun, Dec 06, 2020 at 05:58:32PM +0900, Blaž Hrastnik wrote: >>>>> >>>>>> More on that, the whole purpose of proposed interface is to debug and >>>>>> not intended to be used by any user space code. >>>>> >>>>> The purpose is to provide raw access to the Surface Serial Hub protocol, >>>>> just like we provide raw access to USB devices and have hidraw devices. >>>>> >>>>> So this goes a litle beyond just debugging; and eventually the choice >>>>> may be made to implement some functionality with userspace drivers, >>>>> just like we do for some HID and USB devices. >>>>> >>>>> Still I agree with you that adding new userspace API is something which >>>>> needs to be considered carefully. So I will look at this closely when >>>>> reviewing this set. >>>> >>>> To add to that: this was previously a debugfs interface but was moved to misc after review on the initial RFC: >>>> https://lkml.org/lkml/2020/9/24/96 >>> >>> There is a huge difference between the suggestion and final implementation. >>> >>> Greg suggested to add new debug module to the drivers/misc that will >>> open char device explicitly after user loaded that module to debug this >>> hub. However, the author added full blown char device as a first citizen >>> that has all not-break-user constrains. >> >> This module still needs to be loaded explicitly. > > Good then I really do not see a problem with this. > >> And (I might be wrong >> about this) the "not-break-user constraints" hold as soon as I register >> a misc device at all, no? > > Correct. > >> So I don't see how this is a) any different >> than previously discussed with Greg and b) how the uapi header now >> introduces any not-break-user constraints that would not be there >> without it. >> >> This interface is intended as a stable interface. That's something that >> I committed to as soon as I decided to implement this via a misc-device. >> >> Sure, I can move the definitions in the uapi header to the module >> itself, but I don't see any benefit in that. > > Right, if we are going to use a misc chardev for this, then the > correct thing to do is to put the API bits for that chardev under > include/uapi. > > It would still be good if you can provide a pointer to some userspace > tools using this new API; and for the next version maybe add that > pointer to the commit message Right, I will add that to the commit message. I just linked you the scripts in my other response, but here again for completeness: https://github.com/linux-surface/surface-aggregator-module/tree/master/scripts/ssam While I'm not using the header directly (the scripts are written in python) I still think uapi is the right place to put this (please correct me if I'm wrong). Not putting them there seems to be needless obfuscating to me. Regards, Max
On 12/6/20 9:32 AM, Greg Kroah-Hartman wrote: > On Sun, Dec 06, 2020 at 09:07:05AM +0200, Leon Romanovsky wrote: >> On Thu, Dec 03, 2020 at 10:26:31PM +0100, Maximilian Luz wrote: >>> Hello, >>> >>> Here is version two of the Surface System Aggregator Module (SAM/SSAM) >>> driver series, adding initial support for the embedded controller on 5th >>> and later generation Microsoft Surface devices. Initial support includes >>> the ACPI interface to the controller, via which battery and thermal >>> information is provided on some of these devices. >>> >>> The previous version and cover letter detailing what this series is >>> about can be found at >>> >>> https://lore.kernel.org/platform-driver-x86/20201115192143.21571-1-luzmaximilian@gmail.com/ >>> >>> This patch-set can also be found at the following repository and >>> reference, if you prefer to look at a kernel tree instead of these >>> emails: >>> >>> https://github.com/linux-surface/kernel tags/s/surface-aggregator/v2 >>> >>> Thank you all for the feedback to v1, I hope I have addressed all >>> comments. >> >> >> I think that it is too far fetched to attempt and expose UAPI headers >> for some obscure char device that we are all know won't be around in >> a couple of years from now due to the nature of how this embedded world >> works. > > No, that's not ok, we do this for loads of devices out there. If there > is a device that wants to be supported for Linux, and a developer that > wants to support it, we will take it. > >> More on that, the whole purpose of proposed interface is to debug and >> not intended to be used by any user space code. > > I thought that debugfs was going to be used for most of the debugging > code, or has that changed in newer versions of this patchset? As per previous discussion (https://lkml.org/lkml/2020/9/24/96) I have replaced the debugfs device by a misc-device with stable interface. I also believe that this is probably the better option long-term. The general idea is to have a device that has direct access to the EC/transport protocol and can be used for development and prototyping. Debugging is a part of that. So it's more akin to something raw access via i2cdev, hidraw, or raw access to USB devices as Hans de Goede mentioned in one of his mails. Note that the module must still be loaded manually Regards, Max
On Sun, Dec 06, 2020 at 11:33:40AM +0100, Maximilian Luz wrote: > On 12/6/20 10:06 AM, Leon Romanovsky wrote:> On Sun, Dec 06, 2020 at 05:58:32PM +0900, Blaž Hrastnik wrote: > > > > > > > > > More on that, the whole purpose of proposed interface is to debug and > > > > > not intended to be used by any user space code. > > > > > > > > The purpose is to provide raw access to the Surface Serial Hub protocol, > > > > just like we provide raw access to USB devices and have hidraw devices. > > > > > > > > So this goes a litle beyond just debugging; and eventually the choice > > > > may be made to implement some functionality with userspace drivers, > > > > just like we do for some HID and USB devices. > > > > > > > > Still I agree with you that adding new userspace API is something which > > > > needs to be considered carefully. So I will look at this closely when > > > > reviewing this set. > > > > > > To add to that: this was previously a debugfs interface but was moved to misc after review on the initial RFC: > > > https://lkml.org/lkml/2020/9/24/96 > > > > There is a huge difference between the suggestion and final implementation. > > > > Greg suggested to add new debug module to the drivers/misc that will > > open char device explicitly after user loaded that module to debug this > > hub. However, the author added full blown char device as a first citizen > > that has all not-break-user constrains. > > This module still needs to be loaded explicitly. And (I might be wrong > about this) the "not-break-user constraints" hold as soon as I register > a misc device at all, no? I don't think so, files in drivers/misc/* don't have such strict policy. > than previously discussed with Greg and b) how the uapi header now > introduces any not-break-user constraints that would not be there > without it. There is a huge difference between char device for the debug and exposed UAPI header. The first requires from the user to build and explicitly run it, while header allows to reliably build on top of it various applications that we don't control. The not-break-rule talks about the second. > > This interface is intended as a stable interface. That's something that > I committed to as soon as I decided to implement this via a misc-device. > > Sure, I can move the definitions in the uapi header to the module > itself, but I don't see any benefit in that. If someone really wants to > use this interface, they can just as well copy the definitions from the > module source itself. So why not be upfront about it and make life > easier for everyone? Because you are actually making life harder for everyone who cares about UAPIs exposed by the Linux and they definitely different in numbers from those who needs debug interface for the Microsoft Surface board. Thanks > > Regards, > Max >
On Sun, Dec 06, 2020 at 11:41:46AM +0100, Hans de Goede wrote: > Hi, > > On 12/6/20 11:33 AM, Leon Romanovsky wrote: > > On Sun, Dec 06, 2020 at 11:04:06AM +0100, Hans de Goede wrote: > > <snip> > > >> But there is a difference between being careful and just nacking > >> it because no new UAPI may be added at all (also see GKH's response). > > > > I saw, the author misunderstood the Greg's comments. > > Quoting from patch 8/9: > > " > +============================== > +User-Space EC Interface (cdev) > +============================== > + > +The ``surface_aggregator_cdev`` module provides a misc-device for the SSAM > +controller to allow for a (more or less) direct connection from user-space to > +the SAM EC. It is intended to be used for development and debugging, and > +therefore should not be used or relied upon in any other way. Note that this > +module is not loaded automatically, but instead must be loaded manually. > " > > If I'm not mistaken that seems to be pretty much what Greg asked for. Right, unless you forget the end of his request. " The "joy" of creating a user api is that no matter how much you tell people "do not depend on this", they will, so no matter the file being in debugfs, or a misc device, you might be stuck with it for forever, sorry. " So I still think that exposing user api for a development and debug of device that has no future is wrong thing to do. Thanks > > Regards, > > Hans >
On 12/6/20 12:30 PM, Leon Romanovsky wrote: > On Sun, Dec 06, 2020 at 11:33:40AM +0100, Maximilian Luz wrote: >> On 12/6/20 10:06 AM, Leon Romanovsky wrote:> On Sun, Dec 06, 2020 at 05:58:32PM +0900, Blaž Hrastnik wrote: >>>>> >>>>>> More on that, the whole purpose of proposed interface is to debug and >>>>>> not intended to be used by any user space code. >>>>> >>>>> The purpose is to provide raw access to the Surface Serial Hub protocol, >>>>> just like we provide raw access to USB devices and have hidraw devices. >>>>> >>>>> So this goes a litle beyond just debugging; and eventually the choice >>>>> may be made to implement some functionality with userspace drivers, >>>>> just like we do for some HID and USB devices. >>>>> >>>>> Still I agree with you that adding new userspace API is something which >>>>> needs to be considered carefully. So I will look at this closely when >>>>> reviewing this set. >>>> >>>> To add to that: this was previously a debugfs interface but was moved to misc after review on the initial RFC: >>>> https://lkml.org/lkml/2020/9/24/96 >>> >>> There is a huge difference between the suggestion and final implementation. >>> >>> Greg suggested to add new debug module to the drivers/misc that will >>> open char device explicitly after user loaded that module to debug this >>> hub. However, the author added full blown char device as a first citizen >>> that has all not-break-user constrains. >> >> This module still needs to be loaded explicitly. And (I might be wrong >> about this) the "not-break-user constraints" hold as soon as I register >> a misc device at all, no? > > I don't think so, files in drivers/misc/* don't have such strict policy. Can I get a link to the documentation stating that or someone else confirming that? Also I don't think it makes sense to have a platform/surface device in drivers/misc, after we've explicitly decided to move this code out of there. IIRC drivers/misc is not a place for misc-devices, but the directory for devices that don't have any good place elsewhere. >> than previously discussed with Greg and b) how the uapi header now >> introduces any not-break-user constraints that would not be there >> without it. > > There is a huge difference between char device for the debug and > exposed UAPI header. The first requires from the user to build and > explicitly run it, while header allows to reliably build on top of > it various applications that we don't control. The not-break-rule > talks about the second. So it's okay to break stuff that's not explicitly in include/uapi/? Again, can I get someone to confirm that for me? As already said, I'm okay with moving the definitions from the header to the module itself (if there is a consensus on that, CC Greg, Hans), however both allow you to build user-space tools against the API. Case in point my python scripts, which don't use the header. Or any other non-C-based tool. So unless there's a rule that anything without a header in uapi is fair game, I fail to see your point. >> This interface is intended as a stable interface. That's something that >> I committed to as soon as I decided to implement this via a misc-device. >> >> Sure, I can move the definitions in the uapi header to the module >> itself, but I don't see any benefit in that. If someone really wants to >> use this interface, they can just as well copy the definitions from the >> module source itself. So why not be upfront about it and make life >> easier for everyone? > > Because you are actually making life harder for everyone who cares about > UAPIs exposed by the Linux and they definitely different in numbers from > those who needs debug interface for the Microsoft Surface board. This point again depends on the interface not being stable. Unless, of course, you want me to remove the interface completely and just maintain it out of tree... I'm happy to be told otherwise by the authorities here, but from past conversations it seems that basically everything providing some sort of user-space access falls under the "don't break user-space" rule as soon as somebody uses it. Regards, Max
On 12/6/20 12:41 PM, Leon Romanovsky wrote: > On Sun, Dec 06, 2020 at 11:41:46AM +0100, Hans de Goede wrote: >> Hi, >> >> On 12/6/20 11:33 AM, Leon Romanovsky wrote: >>> On Sun, Dec 06, 2020 at 11:04:06AM +0100, Hans de Goede wrote: >> >> <snip> >> >>>> But there is a difference between being careful and just nacking >>>> it because no new UAPI may be added at all (also see GKH's response). >>> >>> I saw, the author misunderstood the Greg's comments. >> >> Quoting from patch 8/9: >> >> " >> +============================== >> +User-Space EC Interface (cdev) >> +============================== >> + >> +The ``surface_aggregator_cdev`` module provides a misc-device for the SSAM >> +controller to allow for a (more or less) direct connection from user-space to >> +the SAM EC. It is intended to be used for development and debugging, and >> +therefore should not be used or relied upon in any other way. Note that this >> +module is not loaded automatically, but instead must be loaded manually. >> " >> >> If I'm not mistaken that seems to be pretty much what Greg asked for. > > Right, unless you forget the end of his request. > " > The "joy" of creating a user api is that no matter how much you tell > people "do not depend on this", they will, so no matter the file being > in debugfs, or a misc device, you might be stuck with it for forever, > sorry. > " Which to me reads as "if you want a user-space interface for development and debugging, you'll have to make it a stable interface, regardless where it is implemented". Rather making a point for a well though-out stable interface. Specifically with regards to " > - So you suggest I go with a misc device instead of putting this into > debugfs? Yes. " Unless of course I'm misunderstanding things entirely. Greg, please feel free to yell at me if I've got this wrong. > So I still think that exposing user api for a development and debug of device > that has no future is wrong thing to do. Unless you know something that I don't, MS is rumored to come out with a new Surface Pro 8 and Surface Laptop 4 early next year, which I fully expect to also have this EC built in. And if they once again decided to move some functionality normally provided via ACPI or other means to the EC for some reason, we'll likely need that interface again. Yes, it has no future outside of Surface devices, but so has every other platform driver with respect to their specific platform. What are your alternatives to exposing a user API? If we want to be able to easily test and attempt to provide support for Surface devices, there has to be some interaction with user-space. With respect to stability of the interface and future changes, I believe that IOCTLs are the way to go. If that's in debugfs or, as was the result from the previous discussion about this, via a misc-device... I'll be happy to implement whatever a consensus yields, as long as it can be used for its intended purpose: aid development with regards to the EC found on Surface devices. Regards, Max
On 12/6/20 8:07 AM, Leon Romanovsky wrote: > On Thu, Dec 03, 2020 at 10:26:31PM +0100, Maximilian Luz wrote: >> Hello, >> >> Here is version two of the Surface System Aggregator Module (SAM/SSAM) >> driver series, adding initial support for the embedded controller on 5th >> and later generation Microsoft Surface devices. Initial support includes >> the ACPI interface to the controller, via which battery and thermal >> information is provided on some of these devices. >> >> The previous version and cover letter detailing what this series is >> about can be found at >> >> https://lore.kernel.org/platform-driver-x86/20201115192143.21571-1-luzmaximilian@gmail.com/ >> >> This patch-set can also be found at the following repository and >> reference, if you prefer to look at a kernel tree instead of these >> emails: >> >> https://github.com/linux-surface/kernel tags/s/surface-aggregator/v2 >> >> Thank you all for the feedback to v1, I hope I have addressed all >> comments. > > > I think that it is too far fetched to attempt and expose UAPI headers > for some obscure char device that we are all know won't be around in > a couple of years from now due to the nature of how this embedded world > works. > > More on that, the whole purpose of proposed interface is to debug and > not intended to be used by any user space code. I believe this has already been extensively discussed. I want to focus more on the part below in this response: > Also the idea that you are creating new bus just for this device doesn't > really sound right. I recommend you to take a look on auxiliary bus and > use it or come with very strong justifications why it is not fit yet. I tend to agree that this is a valid concern to bring up, and adding a new bus is not something that should be done lightly. Let's ignore that this has been merged into -next after I've submitted this (and that I only recently became aware of this) for the time being. If I would see a clear benefit, I would not hesitate to switch the driver and subsystem over to this. What does concern me most, is the device/driver matching by string. Right now, this subsystem matches those via a device UID. This UID is directly tied to the EC functionality provided by the device. A bit of background to this: Requests sent to the EC contain an address, so to say. This consists of - Target category (TC): Broad group of functionality, e.g. battery/AC, thermal, HID input, ..., i.e. a subsystem of sorts. - Target ID (TID): Some major device, e.g. the dual batteries on the Surface Book 3 are addressed by target ID 1 and 2, some functionality is only available at 2 and some only at 1. May be related to physical parts of/locations on the device. - Instance ID (IID): A device instance, e.g. for thermal sensors each sensor is at TC=0x03 (thermal) and has a different instance ID. Those can be used to pretty much uniquely identify a sub-device on the EC. Note the "pretty much". To truly make them unique we can add a function ID (FN). With that, we can for example match for TC=0x03, TID=*, IID=*, FN=0x00 to load a driver against all thermal sensors. And this is basically the device UID that the subsystem uses for matching (modulo domain for virtual devices, i.e. device hubs). Sure, we can use some string, but that then leads to having to come up with creative names once we need some driver specific data, e.g. in the battery driver [1]: const struct auxiliary_device_id my_auxiliary_id_table[] = { { .name = "surface_aggregator_registry.battery", .driver_data = x }, { .name = "surface_aggregator_registry.battery_sb3", .driver_data = y }, { }, } Arguably, not _that_ big of a deal. What worries me more is that this will block any path of auto-detecting devices on a more general/global level. Right now, we hard-code devices because we haven't found any way to detect them via some EC query yet [2] (FYI the node groups contain all devices that will eventually be added to the bus, which are already 11 devices on the Surface Book 3 without taking missing thermal sensors into account; also they are spread across a bunch of subsystems, so not just platform). That's of course not an ideal solution and one that I hope we can eventually fix. If we can auto-detect devices, it's very likely that we know or can easily get to the device UID. A meaningful string is somewhat more difficult. This registry, which is loaded against a platform device that, from what we can tell differentiates the models for some driver bindings by Windows (that's speculation), is also the reason why we don't register client devices directly under the main module, so instead of a nice "surface_aggregator.<devicename>", you'll get "surface_aggregator_registry.<devicename>". And it may not end there. Something that's currently not implemented is support for thermal sensors on 7th generation devices. With thermal sensors, we can already detect which sensors, i.e. which IIDs, are present. Naturally, that's part of the EC-API for thermal devices (TC=0x03), so would warrant a master driver that registers the individual sensor drivers (that's a place where I'd argue that in a normal situation, the auxiliary bus makes sense). So with the auxiliary bus we'd now end up with devices with "surface_thermal.sensor" for the sensors as well as "surface_aggregator_registry.<devicename>", both of type ssam_device (which then would be a wrapper around auxiliary_device with UID stored in that wrapper). Note that they need to be of type ssam_device (or another wrapper around that) as they again need the reference to the controller device, their UID for access, etc. With a proper bus, device, and the UID for matching, we can just add the sensor devices to the bus again, as they will have a meaningful and guaranteed unique UID. From some reports I've seen it looks like thermal sensors may also be available separately on TID=0x01 as well as TID=0x02 on some devices, at which point I believe you'd need to introduce some IDA for ID allocation to not cause a clash with IDs. At least if you separate the base drivers for each TC, which I guess should be preferred due to code-reuse. Then again they might use different event registries so you may end up needing "surface_thermal.sensor_tc1" and "surface_thermal.sensor_tc2" as device names to differentiate those for driver loading. Or store the registry in software node properties when registering the device. I'm repeating myself here, but to me it looks cleaner to have a single bus type as opposed to spreading the same base auxiliary device type over several namespaces. Which then leads me to the question of how a function like "is_ssam_device()", i.e. a function testing if the device is of a given type, would be implemented without enforcing and testing against some part of the device name. Something that, again, doesn't look clean to me. Although the use of such a function could probably avoided, but that then feels like working around the auxiliary bus. Unfortunately, there are a couple more hypotheticals at play than I'd like to have (making this not an easy decision), but it's a reverse engineered driver so I guess that comes with the territory. All in all, I believe it's possible to do this (i.e. use the auxiliary bus), but, to me at least, the implementation using a discrete bus feels tidier and more true to the hardware (or virtual hardware anyway) behind this. I'm happy to hear any arguments against this though. Regards, Max [1]: https://github.com/linux-surface/surface-aggregator-module/blob/61b9bb859c30a8e17654c3a06696feb2691438f7/module/src/clients/surface_battery.c#L1075-L1079 [2]: https://github.com/linux-surface/surface-aggregator-module/blob/61b9bb859c30a8e17654c3a06696feb2691438f7/module/src/clients/surface_aggregator_registry.c
On Sun, Dec 06, 2020 at 04:58:52PM +0100, Maximilian Luz wrote: > On 12/6/20 8:07 AM, Leon Romanovsky wrote: > > On Thu, Dec 03, 2020 at 10:26:31PM +0100, Maximilian Luz wrote: > > > Hello, > > > > > > Here is version two of the Surface System Aggregator Module (SAM/SSAM) > > > driver series, adding initial support for the embedded controller on 5th > > > and later generation Microsoft Surface devices. Initial support includes > > > the ACPI interface to the controller, via which battery and thermal > > > information is provided on some of these devices. > > > > > > The previous version and cover letter detailing what this series is > > > about can be found at > > > > > > https://lore.kernel.org/platform-driver-x86/20201115192143.21571-1-luzmaximilian@gmail.com/ > > > > > > This patch-set can also be found at the following repository and > > > reference, if you prefer to look at a kernel tree instead of these > > > emails: > > > > > > https://github.com/linux-surface/kernel tags/s/surface-aggregator/v2 > > > > > > Thank you all for the feedback to v1, I hope I have addressed all > > > comments. > > > > > > I think that it is too far fetched to attempt and expose UAPI headers > > for some obscure char device that we are all know won't be around in > > a couple of years from now due to the nature of how this embedded world > > works. > > > > More on that, the whole purpose of proposed interface is to debug and > > not intended to be used by any user space code. > > I believe this has already been extensively discussed. I want to focus > more on the part below in this response: > > > Also the idea that you are creating new bus just for this device doesn't > > really sound right. I recommend you to take a look on auxiliary bus and > > use it or come with very strong justifications why it is not fit yet. > > I tend to agree that this is a valid concern to bring up, and adding a > new bus is not something that should be done lightly. > > Let's ignore that this has been merged into -next after I've submitted > this (and that I only recently became aware of this) for the time being. > If I would see a clear benefit, I would not hesitate to switch the > driver and subsystem over to this. > > What does concern me most, is the device/driver matching by string. > Right now, this subsystem matches those via a device UID. This UID is > directly tied to the EC functionality provided by the device. A bit of > background to this: > > Requests sent to the EC contain an address, so to say. This consists of > > - Target category (TC): Broad group of functionality, e.g. battery/AC, > thermal, HID input, ..., i.e. a subsystem of sorts. > > - Target ID (TID): Some major device, e.g. the dual batteries on the > Surface Book 3 are addressed by target ID 1 and 2, some functionality > is only available at 2 and some only at 1. May be related to physical > parts of/locations on the device. > > - Instance ID (IID): A device instance, e.g. for thermal sensors each > sensor is at TC=0x03 (thermal) and has a different instance ID. > > Those can be used to pretty much uniquely identify a sub-device on the > EC. > > Note the "pretty much". To truly make them unique we can add a function > ID (FN). With that, we can for example match for TC=0x03, TID=*, IID=*, > FN=0x00 to load a driver against all thermal sensors. And this is > basically the device UID that the subsystem uses for matching (modulo > domain for virtual devices, i.e. device hubs). Sure, we can use some > string, but that then leads to having to come up with creative names > once we need some driver specific data, e.g. in the battery driver [1]: > > const struct auxiliary_device_id my_auxiliary_id_table[] = { > { .name = "surface_aggregator_registry.battery", .driver_data = x }, > { .name = "surface_aggregator_registry.battery_sb3", .driver_data = y }, > { }, > } > > Arguably, not _that_ big of a deal. > > What worries me more is that this will block any path of auto-detecting > devices on a more general/global level. Right now, we hard-code devices > because we haven't found any way to detect them via some EC query yet > [2] (FYI the node groups contain all devices that will eventually be > added to the bus, which are already 11 devices on the Surface Book 3 > without taking missing thermal sensors into account; also they are > spread across a bunch of subsystems, so not just platform). That's of > course not an ideal solution and one that I hope we can eventually fix. > If we can auto-detect devices, it's very likely that we know or can > easily get to the device UID. A meaningful string is somewhat more > difficult. > > This registry, which is loaded against a platform device that, from what > we can tell differentiates the models for some driver bindings by > Windows (that's speculation), is also the reason why we don't register > client devices directly under the main module, so instead of a nice > "surface_aggregator.<devicename>", you'll get > "surface_aggregator_registry.<devicename>". And it may not end there. > > Something that's currently not implemented is support for thermal > sensors on 7th generation devices. With thermal sensors, we can already > detect which sensors, i.e. which IIDs, are present. Naturally, that's > part of the EC-API for thermal devices (TC=0x03), so would warrant a > master driver that registers the individual sensor drivers (that's a > place where I'd argue that in a normal situation, the auxiliary bus > makes sense). So with the auxiliary bus we'd now end up with devices > with "surface_thermal.sensor" for the sensors as well as > "surface_aggregator_registry.<devicename>", both of type ssam_device > (which then would be a wrapper around auxiliary_device with UID stored > in that wrapper). Note that they need to be of type ssam_device (or > another wrapper around that) as they again need the reference to the > controller device, their UID for access, etc. With a proper bus, device, > and the UID for matching, we can just add the sensor devices to the bus > again, as they will have a meaningful and guaranteed unique UID. > > From some reports I've seen it looks like thermal sensors may also be > available separately on TID=0x01 as well as TID=0x02 on some devices, > at which point I believe you'd need to introduce some IDA for ID > allocation to not cause a clash with IDs. At least if you separate the > base drivers for each TC, which I guess should be preferred due to > code-reuse. Then again they might use different event registries so you > may end up needing "surface_thermal.sensor_tc1" and > "surface_thermal.sensor_tc2" as device names to differentiate those > for driver loading. Or store the registry in software node properties > when registering the device. > > I'm repeating myself here, but to me it looks cleaner to have a single > bus type as opposed to spreading the same base auxiliary device type > over several namespaces. > > Which then leads me to the question of how a function like > "is_ssam_device()", i.e. a function testing if the device is of a given > type, would be implemented without enforcing and testing against some > part of the device name. Something that, again, doesn't look clean to > me. Although the use of such a function could probably avoided, but that > then feels like working around the auxiliary bus. > > Unfortunately, there are a couple more hypotheticals at play than I'd > like to have (making this not an easy decision), but it's a reverse > engineered driver so I guess that comes with the territory. All in all, > I believe it's possible to do this (i.e. use the auxiliary bus), but, to > me at least, the implementation using a discrete bus feels tidier and > more true to the hardware (or virtual hardware anyway) behind this. I'm > happy to hear any arguments against this though. I'm not so certain why you are so focused on the names and UIDs. The names are simply the aliases and needed to match between major device driver that is responsible for the discovery and triggering sub-driver with implementation. Nothing prohibits from the logic: "if uid = 123; trigger "sensor"" Different devices and separation between them (*_registry module) are expected to be in that main driver. As an outcome of it, you will get: proper module autoload, device discover and power management. Thanks > > Regards, > Max > > [1]: https://github.com/linux-surface/surface-aggregator-module/blob/61b9bb859c30a8e17654c3a06696feb2691438f7/module/src/clients/surface_battery.c#L1075-L1079 > [2]: https://github.com/linux-surface/surface-aggregator-module/blob/61b9bb859c30a8e17654c3a06696feb2691438f7/module/src/clients/surface_aggregator_registry.c
Hi, On 12/6/20 4:58 PM, Maximilian Luz wrote: > On 12/6/20 8:07 AM, Leon Romanovsky wrote: >> On Thu, Dec 03, 2020 at 10:26:31PM +0100, Maximilian Luz wrote: >>> Hello, >>> >>> Here is version two of the Surface System Aggregator Module (SAM/SSAM) >>> driver series, adding initial support for the embedded controller on 5th >>> and later generation Microsoft Surface devices. Initial support includes >>> the ACPI interface to the controller, via which battery and thermal >>> information is provided on some of these devices. >>> >>> The previous version and cover letter detailing what this series is >>> about can be found at >>> >>> https://lore.kernel.org/platform-driver-x86/20201115192143.21571-1-luzmaximilian@gmail.com/ >>> >>> This patch-set can also be found at the following repository and >>> reference, if you prefer to look at a kernel tree instead of these >>> emails: >>> >>> https://github.com/linux-surface/kernel tags/s/surface-aggregator/v2 >>> >>> Thank you all for the feedback to v1, I hope I have addressed all >>> comments. >> >> >> I think that it is too far fetched to attempt and expose UAPI headers >> for some obscure char device that we are all know won't be around in >> a couple of years from now due to the nature of how this embedded world >> works. >> >> More on that, the whole purpose of proposed interface is to debug and >> not intended to be used by any user space code. > > I believe this has already been extensively discussed. I want to focus > more on the part below in this response: > >> Also the idea that you are creating new bus just for this device doesn't >> really sound right. I recommend you to take a look on auxiliary bus and >> use it or come with very strong justifications why it is not fit yet. > > I tend to agree that this is a valid concern to bring up, and adding a > new bus is not something that should be done lightly. > > Let's ignore that this has been merged into -next after I've submitted > this (and that I only recently became aware of this) for the time being. > If I would see a clear benefit, I would not hesitate to switch the > driver and subsystem over to this. > > What does concern me most, is the device/driver matching by string. > Right now, this subsystem matches those via a device UID. This UID is > directly tied to the EC functionality provided by the device. A bit of > background to this: > > Requests sent to the EC contain an address, so to say. This consists of > > - Target category (TC): Broad group of functionality, e.g. battery/AC, > thermal, HID input, ..., i.e. a subsystem of sorts. > > - Target ID (TID): Some major device, e.g. the dual batteries on the > Surface Book 3 are addressed by target ID 1 and 2, some functionality > is only available at 2 and some only at 1. May be related to physical > parts of/locations on the device. > > - Instance ID (IID): A device instance, e.g. for thermal sensors each > sensor is at TC=0x03 (thermal) and has a different instance ID. > > Those can be used to pretty much uniquely identify a sub-device on the > EC. Thank you for this explanation, that is going to be useful to know when I get around to reviewing this set (although I guess that you probably also have written this down in one of the commit msgs / docs I did not check). > > Note the "pretty much". To truly make them unique we can add a function > ID (FN). With that, we can for example match for TC=0x03, TID=*, IID=*, > FN=0x00 to load a driver against all thermal sensors. And this is > basically the device UID that the subsystem uses for matching (modulo > domain for virtual devices, i.e. device hubs). Sure, we can use some > string, but that then leads to having to come up with creative names > once we need some driver specific data, e.g. in the battery driver [1]: > > const struct auxiliary_device_id my_auxiliary_id_table[] = { > { .name = "surface_aggregator_registry.battery", .driver_data = x }, > { .name = "surface_aggregator_registry.battery_sb3", .driver_data = y }, > { }, > } > > Arguably, not _that_ big of a deal. > > What worries me more is that this will block any path of auto-detecting > devices on a more general/global level. Right now, we hard-code devices > because we haven't found any way to detect them via some EC query yet > [2] (FYI the node groups contain all devices that will eventually be > added to the bus, which are already 11 devices on the Surface Book 3 > without taking missing thermal sensors into account; also they are > spread across a bunch of subsystems, so not just platform). That's of > course not an ideal solution and one that I hope we can eventually fix. > If we can auto-detect devices, it's very likely that we know or can > easily get to the device UID. A meaningful string is somewhat more > difficult. > > This registry, which is loaded against a platform device that, from what > we can tell differentiates the models for some driver bindings by > Windows (that's speculation), is also the reason why we don't register > client devices directly under the main module, so instead of a nice > "surface_aggregator.<devicename>", you'll get > "surface_aggregator_registry.<devicename>". And it may not end there. > > Something that's currently not implemented is support for thermal > sensors on 7th generation devices. With thermal sensors, we can already > detect which sensors, i.e. which IIDs, are present. Naturally, that's > part of the EC-API for thermal devices (TC=0x03), so would warrant a > master driver that registers the individual sensor drivers (that's a > place where I'd argue that in a normal situation, the auxiliary bus > makes sense). So with the auxiliary bus we'd now end up with devices > with "surface_thermal.sensor" for the sensors as well as > "surface_aggregator_registry.<devicename>", both of type ssam_device > (which then would be a wrapper around auxiliary_device with UID stored > in that wrapper). Note that they need to be of type ssam_device (or > another wrapper around that) as they again need the reference to the > controller device, their UID for access, etc. With a proper bus, device, > and the UID for matching, we can just add the sensor devices to the bus > again, as they will have a meaningful and guaranteed unique UID. > > From some reports I've seen it looks like thermal sensors may also be > available separately on TID=0x01 as well as TID=0x02 on some devices, > at which point I believe you'd need to introduce some IDA for ID > allocation to not cause a clash with IDs. At least if you separate the > base drivers for each TC, which I guess should be preferred due to > code-reuse. Then again they might use different event registries so you > may end up needing "surface_thermal.sensor_tc1" and > "surface_thermal.sensor_tc2" as device names to differentiate those > for driver loading. Or store the registry in software node properties > when registering the device. > > I'm repeating myself here, but to me it looks cleaner to have a single > bus type as opposed to spreading the same base auxiliary device type > over several namespaces. > > Which then leads me to the question of how a function like > "is_ssam_device()", i.e. a function testing if the device is of a given > type, would be implemented without enforcing and testing against some > part of the device name. Something that, again, doesn't look clean to > me. Although the use of such a function could probably avoided, but that > then feels like working around the auxiliary bus. > > Unfortunately, there are a couple more hypotheticals at play than I'd > like to have (making this not an easy decision), but it's a reverse > engineered driver so I guess that comes with the territory. All in all, > I believe it's possible to do this (i.e. use the auxiliary bus), but, to > me at least, the implementation using a discrete bus feels tidier and > more true to the hardware (or virtual hardware anyway) behind this. I'm > happy to hear any arguments against this though. I agree, the whole setup with the TC + TID + IID feels like the functionality is nicely (and cleanly) split into separate functions and as with other busses using a bus + 1 device per function for this is a perfectly clean way to handle this. Note if in the future you do see benefit in switching the auxiliary bus I have no problems with that. But atm I don't really see any benefits of doing so, so then we would just be switching over for the sake of switching over which does not seem productive. Regards, Hans
On Mon, Dec 07, 2020 at 09:49:03AM +0100, Hans de Goede wrote: > Note if in the future you do see benefit in switching the auxiliary bus > I have no problems with that. But atm I don't really see any benefits of > doing so, so then we would just be switching over for the sake of switching > over which does not seem productive. I too do not see the benefit at this time to switch either. thanks, greg k-h
On 12/8/20 2:01 PM, Hans de Goede wrote: > Hi, > > Thank you for your patch series. > > Full review done, I have a couple of minor review remarks inline below. > > Note these are really all suggestions and not must fix for this to > be merged items. Thank you! I will try to address and fix them. [...] >> +/** >> + * ssam_event_matches_notifier() - Test if an event matches a notifier. >> + * @n: The event notifier to test against. >> + * @event: The event to test. >> + * >> + * Return: Returns %true iff the given event matches the given notifier > > s/iff/if/ Ack. >> +/** >> + * ssam_nf_refcount_inc() - Increment reference-/activation-count of the given >> + * event. >> + * @nf: The notifier system reference. >> + * @reg: The registry used to enable/disable the event. >> + * @id: The event ID. >> + * >> + * Increments the reference-/activation-count associated with the specified >> + * event type/ID, allocating a new entry for this event ID if necessary. A >> + * newly allocated entry will have a refcount of one. >> + * >> + * Note: Must be synchronized by the caller with regards to other >> + * ssam_nf_refcount_inc() and ssam_nf_refcount_dec() calls, e.g. via >> + * ``nf->lock``. Note that this lock should also be used to ensure the >> + * corresponding EC requests are sent, if necessary. > > You write "e.g. via ``nf->lock``", are any other locking strategies used > in the other patches ? If not then it might be good to change this to just > stating that nf->lock must be held and adding an assert for that. I would > prefer a clear set of locking rules like this over rules like: > "must be synchronized ...". There are no other locking strategies here. I've worded it this way as "nf->lock" is mainly intended to synchronize the enable requests and is just by extension also used to synchronize this (i.e. the lock is on a different abstraction layer). I agree that for clarity it makes sense to directly specify "nf->lock", so I will re-write this to just say that "nf->lock" must be held. > Note I am naively assuming here that it is possible to come up with such a > clear set of locking rules, I'm not sure if that is the case. > > Note in case it is possible to always take nf->lock but you are not doing > so in some places because of performance concerns then I would prefer to > move to a model where the caller simply always takes nf->lock. AFAICT non > of the events/data going through the SSH are high frequency (iow > none of them occur at a high rate of see 50 times / second or more) so > I don't think that we need to focus too much on optimizing things for > speed. The registration/deregistration functions are all synchronized via "nf->lock". I also don't consider that path critical. Event handling (i.e. the actual critical path) is done via a RCU list (reasoning behind that is mostly for touchpad input events). >> + * >> + * Return: Returns the refcount entry on success. Returns an error pointer >> + * with %-ENOSPC if there have already been %INT_MAX events of the specified >> + * ID and type registered, or %-ENOMEM if the entry could not be allocated. >> + */ >> +static struct ssam_nf_refcount_entry >> +*ssam_nf_refcount_inc(struct ssam_nf *nf, struct ssam_event_registry reg, >> + struct ssam_event_id id) >> +{ >> + struct ssam_nf_refcount_entry *entry; >> + struct ssam_nf_refcount_key key; >> + struct rb_node **link = &nf->refcount.rb_node; >> + struct rb_node *parent = NULL; >> + int cmp; >> + >> + key.reg = reg; >> + key.id = id; >> + >> + while (*link) { >> + entry = rb_entry(*link, struct ssam_nf_refcount_entry, node); >> + parent = *link; >> + >> + cmp = memcmp(&key, &entry->key, sizeof(key)); >> + if (cmp < 0) { >> + link = &(*link)->rb_left; >> + } else if (cmp > 0) { >> + link = &(*link)->rb_right; >> + } else if (entry->refcount < INT_MAX) { >> + entry->refcount++; >> + return entry; >> + } else { > > So we hit this only if entry->refcount == INT_MAX, right? > > That seems like something which should never happen, right? > > So maybe add a: > > WARN_ON(1); > > here? Yes and yes. That seems like a good idea, I'll do that. >> + return ERR_PTR(-ENOSPC); >> + } >> + } >> + >> + entry = kzalloc(sizeof(*entry), GFP_KERNEL); >> + if (!entry) >> + return ERR_PTR(-ENOMEM); >> + >> + entry->key = key; >> + entry->refcount = 1; >> + >> + rb_link_node(&entry->node, parent, link); >> + rb_insert_color(&entry->node, &nf->refcount); >> + >> + return entry; >> +} [...] >> +/** >> + * ssam_nf_call() - Call notification callbacks for the provided event. >> + * @nf: The notifier system >> + * @dev: The associated device, only used for logging. >> + * @rqid: The request ID of the event. >> + * @event: The event provided to the callbacks. >> + * >> + * Execute registered callbacks in order of their priority until either no >> + * callback is left or a callback returns a value with the %SSAM_NOTIF_STOP >> + * bit set. Note that this bit is set automatically when converting non-zero >> + * error values via ssam_notifier_from_errno() to notifier values. >> + * >> + * Also note that any callback that could handle an event should return a value >> + * with bit %SSAM_NOTIF_HANDLED set, indicating that the event does not go >> + * unhandled/ignored. In case no registered callback could handle an event, >> + * this function will emit a warning. >> + * >> + * In case a callback failed, this function will emit an error message. >> + */ >> +static void ssam_nf_call(struct ssam_nf *nf, struct device *dev, u16 rqid, >> + struct ssam_event *event) >> +{ >> + struct ssam_nf_head *nf_head; >> + int status, nf_ret; >> + >> + if (!ssh_rqid_is_event(rqid)) { >> + dev_warn(dev, "event: unsupported rqid: %#06x\n", rqid); >> + return; >> + } >> + >> + nf_head = &nf->head[ssh_rqid_to_event(rqid)]; >> + nf_ret = ssam_nfblk_call_chain(nf_head, event); >> + status = ssam_notifier_to_errno(nf_ret); >> + >> + if (status < 0) { >> + dev_err(dev, >> + "event: error handling event: %d (tc: %#04x, tid: %#04x, cid: %#04x, iid: %#04x)\n", >> + status, event->target_category, event->target_id, >> + event->command_id, event->instance_id); >> + } >> + >> + if (!(nf_ret & SSAM_NOTIF_HANDLED)) { > > Maybe use "else if" here so that on an error we don't emit both the error > and the unhandled event warning ? Right, makes sense. An error should be enough. >> + dev_warn(dev, >> + "event: unhandled event (rqid: %#04x, tc: %#04x, tid: %#04x, cid: %#04x, iid: %#04x)\n", >> + rqid, event->target_category, event->target_id, >> + event->command_id, event->instance_id); >> + } >> +} >> + [...] >> +static int ssam_dsm_get_functions(acpi_handle handle, u64 *funcs) >> +{ >> + union acpi_object *obj; >> + u64 mask = 0; >> + int i; >> + >> + *funcs = 0; >> + >> + if (!acpi_has_method(handle, "_DSM")) >> + return 0; > > Is this expected on some models? It is, yes. Specifically: Older models don't have that, newer do. > If yes then maybe add a comment. Right. I'll have to look at the DSDTs to check in which generation specifically that _DSM was introduced and add that as a comment. > If not then I think this can be dropped as acpi_evaluate_dsm_typed() > will then error out with -EIO which seems better then return 0 > (while this is not expected). > >> + >> + obj = acpi_evaluate_dsm_typed(handle, &SSAM_SSH_DSM_GUID, >> + SSAM_SSH_DSM_REVISION, 0, NULL, >> + ACPI_TYPE_BUFFER); >> + if (!obj) >> + return -EIO; >> + >> + for (i = 0; i < obj->buffer.length && i < 8; i++) >> + mask |= (((u64)obj->buffer.pointer[i]) << (i * 8)); >> + >> + if (mask & BIT(0)) >> + *funcs = mask; >> + >> + ACPI_FREE(obj); >> + return 0; >> +} >> + >> +static int ssam_dsm_load_u32(acpi_handle handle, u64 funcs, u64 func, u32 *ret) >> +{ >> + union acpi_object *obj; >> + u64 val; >> +> + if (!(funcs & BIT(func))) >> + return 0; > > This exit path leaves *ret uninitialized, looking at the usage if this > function I see that this is intentional, but it did stood out to me, so maybe > add a comment like this ? : > > if (!(funcs & BIT(func))) > return 0; /* Not supported leave *ret at its default value */ Right, will do that. >> + >> + obj = acpi_evaluate_dsm_typed(handle, &SSAM_SSH_DSM_GUID, >> + SSAM_SSH_DSM_REVISION, func, NULL, >> + ACPI_TYPE_INTEGER); >> + if (!obj) >> + return -EIO; >> + >> + val = obj->integer.value; >> + ACPI_FREE(obj); >> + >> + if (val > U32_MAX) >> + return -ERANGE; >> + >> + *ret = val; >> + return 0; >> +} [...] >> +/** >> + * ssam_controller_start() - Start the receiver and transmitter threads of the >> + * controller. >> + * @ctrl: The controller. >> + * >> + * Note: When this function is called, the controller should be properly >> + * hooked up to the serdev core via &struct serdev_device_ops. Please refer >> + * to ssam_controller_init() for more details on controller initialization. >> + * >> + * This function must be called from an exclusive context with regards to the >> + * state, if necessary, by locking the controller via ssam_controller_lock(). > > Again you are being a bit hand-wavy (I assume you know what I mean by that) > wrt the locking requirements. If possible I would prefer clearly spelled out > locking requirements in the form of "this and that lock must be held when > calling this function". Preferably backed-up by lockdep_assert-s asserting > these conditions. The reason for this is that this function specifically is currently only called during initialization, when the controller has not been published yet, i.e. when we have an exclusive reference to the controller. I'll change this to fully enforce locking (with lockdep_assert). > And maybe if you are a bit stricter with always holding the lock when > calling this, you can also drop the WRITE_ONCE and the comment about it > (in all places where you do this). The WRITE_ONCE is only there to ensure that the basic test in ssam_request_sync_submit() can be done. I always try to be explicit about access that can happen without the respective locks being held. Unfortunately it's not feasible to hold the reader lock in ssam_request_sync_submit() due to reentrancy. Specifically, as the lock, if at all (i.e. if this is not a client driver bound to the controller), must be held not only during submission but until the request has been completed. Note that if we would hold the lock during submission, this is just a smoke-test. > Note, that like my remarks around the locking docs for > ssam_nf_refcount_inc() this assumes that it is possible to come > up with such a clear set of locking rules. If that for some reason > is not straight forwardd, then maybe add some docs documenting the > locking expectations somewhere instead ? It's possible to fully enforce this here, so I'll do that. >> + */ >> +int ssam_controller_start(struct ssam_controller *ctrl) >> +{ >> + int status; >> + >> + if (ctrl->state != SSAM_CONTROLLER_INITIALIZED) >> + return -EINVAL; >> + >> + status = ssh_rtl_start(&ctrl->rtl); >> + if (status) >> + return status; >> + >> + /* >> + * Set state via write_once even though we expect to be locked/in an >> + * exclusive context, due to smoke-testing in >> + * ssam_request_sync_submit(). >> + */ >> + WRITE_ONCE(ctrl->state, SSAM_CONTROLLER_STARTED); >> + return 0; >> +} >> + >> +/* >> + * SSAM_CTRL_SHUTDOWN_FLUSH_TIMEOUT - Timeout for flushing requests during >> + * shutdown. >> + * >> + * Chosen to be larger than one full request timeout, including packets timing >> + * out. This value should give ample time to complete any outstanding requests >> + * during normal operation and account for the odd package timeout. >> + */ >> +#define SSAM_CTRL_SHUTDOWN_FLUSH_TIMEOUT msecs_to_jiffies(5000) >> + >> +/** >> + * ssam_controller_shutdown() - Shut down the controller. >> + * @ctrl: The controller. >> + * >> + * Shuts down the controller by flushing all pending requests and stopping the >> + * transmitter and receiver threads. All requests submitted after this call >> + * will fail with %-ESHUTDOWN. While it is discouraged to do so, this function >> + * is safe to use in parallel with ongoing request submission. >> + * >> + * In the course of this shutdown procedure, all currently registered >> + * notifiers will be unregistered. It is, however, strongly recommended to not >> + * rely on this behavior, and instead the party registering the notifier >> + * should unregister it before the controller gets shut down, e.g. via the >> + * SSAM bus which guarantees client devices to be removed before a shutdown. >> + * >> + * Note that events may still be pending after this call, but, due to the >> + * notifiers being unregistered, these events will be dropped when the >> + * controller is subsequently destroyed via ssam_controller_destroy(). >> + * >> + * This function must be called from an exclusive context with regards to the >> + * state, if necessary, by locking the controller via ssam_controller_lock(). > > Same comment as earlier wrt the locking. Similar to the above, this can be currently called outside of the lock during initialization failure. I'll change that to fully enforce locking too. >> + */ >> +void ssam_controller_shutdown(struct ssam_controller *ctrl) >> +{ >> + enum ssam_controller_state s = ctrl->state; >> + int status; >> + >> + if (s == SSAM_CONTROLLER_UNINITIALIZED || s == SSAM_CONTROLLER_STOPPED) >> + return; >> + >> + /* >> + * Try to flush pending events and requests while everything still >> + * works. Note: There may still be packets and/or requests in the >> + * system after this call (e.g. via control packets submitted by the >> + * packet transport layer or flush timeout / failure, ...). Those will >> + * be handled with the ssh_rtl_shutdown() call below. >> + */ >> + status = ssh_rtl_flush(&ctrl->rtl, SSAM_CTRL_SHUTDOWN_FLUSH_TIMEOUT); >> + if (status) { >> + ssam_err(ctrl, "failed to flush request transport layer: %d\n", >> + status); >> + } >> + >> + /* Try to flush all currently completing requests and events. */ >> + ssam_cplt_flush(&ctrl->cplt); >> + >> + /* >> + * We expect all notifiers to have been removed by the respective client >> + * driver that set them up at this point. If this warning occurs, some >> + * client driver has not done that... >> + */ >> + WARN_ON(!ssam_notifier_is_empty(ctrl)); >> + >> + /* >> + * Nevertheless, we should still take care of drivers that don't behave >> + * well. Thus disable all enabled events, unregister all notifiers. >> + */ >> + ssam_notifier_unregister_all(ctrl); >> + >> + /* >> + * Cancel remaining requests. Ensure no new ones can be queued and stop >> + * threads. >> + */ >> + ssh_rtl_shutdown(&ctrl->rtl); >> + >> + /* >> + * Set state via write_once even though we expect to be locked/in an >> + * exclusive context, due to smoke-testing in >> + * ssam_request_sync_submit(). >> + */ >> + WRITE_ONCE(ctrl->state, SSAM_CONTROLLER_STOPPED); >> + ctrl->rtl.ptl.serdev = NULL; >> +} >> + >> +/** >> + * ssam_controller_destroy() - Destroy the controller and free its resources. >> + * @ctrl: The controller. >> + * >> + * Ensures that all resources associated with the controller get freed. This >> + * function should only be called after the controller has been stopped via >> + * ssam_controller_shutdown(). In general, this function should not be called >> + * directly. The only valid place to call this function directly is during >> + * initialization, before the controller has been fully initialized and passed >> + * to other processes. This function is called automatically when the >> + * reference count of the controller reaches zero. >> + * >> + * Must be called from an exclusive context with regards to the controller >> + * state. > > Same comment as earlier wrt the locking. Right, I'll change that as well. >> + */ >> +void ssam_controller_destroy(struct ssam_controller *ctrl) >> +{ >> + if (ctrl->state == SSAM_CONTROLLER_UNINITIALIZED) >> + return; >> + >> + WARN_ON(ctrl->state != SSAM_CONTROLLER_STOPPED); >> + >> + /* >> + * Note: New events could still have been received after the previous >> + * flush in ssam_controller_shutdown, before the request transport layer >> + * has been shut down. At this point, after the shutdown, we can be sure >> + * that no new events will be queued. The call to ssam_cplt_destroy will >> + * ensure that those remaining are being completed and freed. >> + */ >> + >> + /* Actually free resources. */ >> + ssam_cplt_destroy(&ctrl->cplt); >> + ssh_rtl_destroy(&ctrl->rtl); >> + >> + /* >> + * Set state via write_once even though we expect to be locked/in an >> + * exclusive context, due to smoke-testing in >> + * ssam_request_sync_submit(). >> + */ >> + WRITE_ONCE(ctrl->state, SSAM_CONTROLLER_UNINITIALIZED); >> +} >> + [...] >> +/* Must be called with queue lock held. */ >> +static struct list_head *__ssh_ptl_queue_find_entrypoint(struct ssh_packet *p) >> +{ >> + struct list_head *head; >> + struct ssh_packet *q; >> + >> + /* >> + * We generally assume that there are less control (ACK/NAK) packets >> + * and re-submitted data packets as there are normal data packets (at >> + * least in situations in which many packets are queued; if there >> + * aren't many packets queued the decision on how to iterate should be >> + * basically irrelevant; the number of control/data packets is more or >> + * less limited via the maximum number of pending packets). Thus, when >> + * inserting a control or re-submitted data packet, (determined by >> + * their priority), we search from front to back. Normal data packets >> + * are, usually queued directly at the tail of the queue, so for those >> + * search from back to front. >> + */ >> + >> + if (p->priority > SSH_PACKET_PRIORITY(DATA, 0)) { >> + list_for_each(head, &p->ptl->queue.head) { >> + q = list_entry(head, struct ssh_packet, queue_node); >> + >> + if (q->priority < p->priority) >> + break; >> + } >> + } else { >> + list_for_each_prev(head, &p->ptl->queue.head) { >> + q = list_entry(head, struct ssh_packet, queue_node); >> + >> + if (q->priority >= p->priority) { >> + head = head->next; >> + break; >> + } >> + } >> + } >> + >> + return head; >> +} >> + >> +/* Must be called with queue lock held. */ > > Maybe add a lockdep_assert for this ? > > (and the same for a bunch of similar cases below) Right, makes sense. I'll do that. >> +static int __ssh_ptl_queue_push(struct ssh_packet *packet) >> +{ >> + struct ssh_ptl *ptl = packet->ptl; >> + struct list_head *head; >> + >> + if (test_bit(SSH_PTL_SF_SHUTDOWN_BIT, &ptl->state)) >> + return -ESHUTDOWN; >> + >> + /* Avoid further transitions when canceling/completing. */ >> + if (test_bit(SSH_PACKET_SF_LOCKED_BIT, &packet->state)) >> + return -EINVAL; >> + >> + /* If this packet has already been queued, do not add it. */ >> + if (test_and_set_bit(SSH_PACKET_SF_QUEUED_BIT, &packet->state)) >> + return -EALREADY; >> + >> + head = __ssh_ptl_queue_find_entrypoint(packet); >> + >> + list_add_tail(&ssh_packet_get(packet)->queue_node, head); >> + return 0; >> +} > > <snip about 5000 more lines> > > Snipped because nothing stood out in the rest of this patch. > > Phew that was one large patch to review. I've run out of review > bandwidth for today, so I will review the rest of the series > later (probably coming Thursday). I'm sorry about the size of this. I've tried to split out as much as possible (caching, tracing, error injection) but couldn't get it smaller than this while also having something functional. Please do review this at your own pace. There is no need to hurry. Thank you for your review! Regards, Max
Hi, On 12/8/20 3:37 PM, Maximilian Luz wrote: <snip> >>> + >>> + obj = acpi_evaluate_dsm_typed(handle, &SSAM_SSH_DSM_GUID, >>> + SSAM_SSH_DSM_REVISION, func, NULL, >>> + ACPI_TYPE_INTEGER); >>> + if (!obj) >>> + return -EIO; >>> + >>> + val = obj->integer.value; >>> + ACPI_FREE(obj); >>> + >>> + if (val > U32_MAX) >>> + return -ERANGE; >>> + >>> + *ret = val; >>> + return 0; >>> +} > > [...] > >>> +/** >>> + * ssam_controller_start() - Start the receiver and transmitter threads of the >>> + * controller. >>> + * @ctrl: The controller. >>> + * >>> + * Note: When this function is called, the controller should be properly >>> + * hooked up to the serdev core via &struct serdev_device_ops. Please refer >>> + * to ssam_controller_init() for more details on controller initialization. >>> + * >>> + * This function must be called from an exclusive context with regards to the >>> + * state, if necessary, by locking the controller via ssam_controller_lock(). >> >> Again you are being a bit hand-wavy (I assume you know what I mean by that) >> wrt the locking requirements. If possible I would prefer clearly spelled out >> locking requirements in the form of "this and that lock must be held when >> calling this function". Preferably backed-up by lockdep_assert-s asserting >> these conditions. > > The reason for this is that this function specifically is currently only > called during initialization, when the controller has not been published > yet, i.e. when we have an exclusive reference to the controller. > > I'll change this to fully enforce locking (with lockdep_assert). > >> And maybe if you are a bit stricter with always holding the lock when >> calling this, you can also drop the WRITE_ONCE and the comment about it >> (in all places where you do this). > > The WRITE_ONCE is only there to ensure that the basic test in > ssam_request_sync_submit() can be done. I always try to be explicit > about access that can happen without the respective locks being held. Yes I saw the matching READ_ONCE later on (as the comment indicated I would), which made it more obvious to me why the WRITE_ONCE is here,' so maybe I should have gone back and updated this comment. Anyways, keeping the WRITE_ONCE + READ_ONCE for this is fine. > Unfortunately it's not feasible to hold the reader lock in > ssam_request_sync_submit() due to reentrancy. Specifically, as the lock, > if at all (i.e. if this is not a client driver bound to the controller), > must be held not only during submission but until the request has been > completed. Note that if we would hold the lock during submission, this > is just a smoke-test. Ack. <more snip> Regards, Hans
On 12/8/20 3:43 PM, Hans de Goede wrote: > Hi, > > On 12/8/20 3:37 PM, Maximilian Luz wrote: > > <snip> > >>>> + >>>> + obj = acpi_evaluate_dsm_typed(handle, &SSAM_SSH_DSM_GUID, >>>> + SSAM_SSH_DSM_REVISION, func, NULL, >>>> + ACPI_TYPE_INTEGER); >>>> + if (!obj) >>>> + return -EIO; >>>> + >>>> + val = obj->integer.value; >>>> + ACPI_FREE(obj); >>>> + >>>> + if (val > U32_MAX) >>>> + return -ERANGE; >>>> + >>>> + *ret = val; >>>> + return 0; >>>> +} >> >> [...] >> >>>> +/** >>>> + * ssam_controller_start() - Start the receiver and transmitter threads of the >>>> + * controller. >>>> + * @ctrl: The controller. >>>> + * >>>> + * Note: When this function is called, the controller should be properly >>>> + * hooked up to the serdev core via &struct serdev_device_ops. Please refer >>>> + * to ssam_controller_init() for more details on controller initialization. >>>> + * >>>> + * This function must be called from an exclusive context with regards to the >>>> + * state, if necessary, by locking the controller via ssam_controller_lock(). >>> >>> Again you are being a bit hand-wavy (I assume you know what I mean by that) >>> wrt the locking requirements. If possible I would prefer clearly spelled out >>> locking requirements in the form of "this and that lock must be held when >>> calling this function". Preferably backed-up by lockdep_assert-s asserting >>> these conditions. >> >> The reason for this is that this function specifically is currently only >> called during initialization, when the controller has not been published >> yet, i.e. when we have an exclusive reference to the controller. >> >> I'll change this to fully enforce locking (with lockdep_assert). >> >>> And maybe if you are a bit stricter with always holding the lock when >>> calling this, you can also drop the WRITE_ONCE and the comment about it >>> (in all places where you do this). >> >> The WRITE_ONCE is only there to ensure that the basic test in >> ssam_request_sync_submit() can be done. I always try to be explicit >> about access that can happen without the respective locks being held. > > Yes I saw the matching READ_ONCE later on (as the comment indicated > I would), which made it more obvious to me why the WRITE_ONCE is here,' > so maybe I should have gone back and updated this comment. No worries, always good to have another look at these kinds of things. > Anyways, keeping the WRITE_ONCE + READ_ONCE for this is fine. > >> Unfortunately it's not feasible to hold the reader lock in >> ssam_request_sync_submit() due to reentrancy. Specifically, as the lock, >> if at all (i.e. if this is not a client driver bound to the controller), >> must be held not only during submission but until the request has been >> completed. Note that if we would hold the lock during submission, this >> is just a smoke-test. > > Ack. > > <more snip> > > Regards, > > Hans >