
[v2,00/17] Refactor fw_devlink to significantly improve boot time

Message ID: 20201121020232.908850-1-saravanak@google.com

Message

Saravana Kannan Nov. 21, 2020, 2:02 a.m. UTC
The current implementation of fw_devlink is very inefficient because it
tries to get away without creating fwnode links in the name of saving
memory usage. Past attempts to optimize runtime at the cost of memory
usage were blocked with requests for data showing that the optimization
made a significant improvement in real-world scenarios.

We have those scenarios now. There have been several reports of boot
time increases on the order of seconds in this thread [1]. Several OEMs
and SoC manufacturers have also privately reported significant
(350-400ms) increases in boot time due to all the parsing done by
fw_devlink.

So this patch series refactors fw_devlink to be more efficient. The key
difference now is the addition of support for fwnode links -- just a few
simple APIs. This also allows most of the code to be moved out of
firmware specific (DT mostly) code into driver core.
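
To give a rough idea of what a link between two firmware nodes can look
like, here is a minimal userspace sketch. It is not the code from this
series: struct node, struct link and link_add() are hypothetical names
chosen for illustration, and locking and link removal are left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model; these names are NOT the kernel's API. */
struct node;

struct link {
	struct node *supplier;
	struct node *consumer;
	struct link *next_supplier;	/* next link in the consumer's supplier list */
	struct link *next_consumer;	/* next link in the supplier's consumer list */
};

struct node {
	const char *name;
	struct link *suppliers;		/* links in which this node is the consumer */
	struct link *consumers;		/* links in which this node is the supplier */
};

/*
 * Record that "con" depends on "sup". A link that already exists is not
 * duplicated. Returns 0 on success and -1 if the allocation fails.
 */
static int link_add(struct node *con, struct node *sup)
{
	struct link *l;

	assert(con && sup);

	for (l = con->suppliers; l; l = l->next_supplier)
		if (l->supplier == sup)
			return 0;

	l = malloc(sizeof(*l));
	if (!l)
		return -1;

	l->supplier = sup;
	l->consumer = con;

	/* Hook the same link object into both nodes' lists. */
	l->next_supplier = con->suppliers;
	con->suppliers = l;
	l->next_consumer = sup->consumers;
	sup->consumers = l;
	return 0;
}
```

Because one link object sits on both lists, the dependency can be walked
from either end (consumer to supplier or supplier to consumer) without
going back to the firmware description.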

This brings the following benefits:
- Instead of parsing the device tree multiple times (complexity was
  close to O(N^3), where N is the number of properties) during bootup,
  fw_devlink parses each fwnode node/property only once and creates
  fwnode links. The rest of the fw_devlink code then just looks at these
  fwnode links to do the rest of the work.

- Makes it much easier to debug probe issues due to fw_devlink in the
  future. fw_devlink=on blocks the probing of devices if they depend on
  a device that hasn't been added yet. With this refactor, it'll be very
  easy to tell what that device is because we now have a reference to
  the fwnode of the device.

- Makes it much easier to add fw_devlink support to ACPI and other
  firmware types. A refactor to move the common bits from DT-specific
  code to driver core was on my TODO list as a prerequisite to adding
  ACPI support to fw_devlink. This series gets that done.
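
The first two points can be illustrated with a small userspace model.
This is only a sketch under stated assumptions, not the driver core
code: struct fw_node, parse_once() and first_missing_supplier() are
hypothetical names, references are plain strings instead of phandles,
and parse_count exists purely to show how often the properties are
walked.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SUPPLIERS 4

/* Hypothetical model of a firmware node; NOT the kernel's fwnode_handle. */
struct fw_node {
	const char *name;
	const char *refs[MAX_SUPPLIERS];	  /* raw "property" references, by name */
	struct fw_node *suppliers[MAX_SUPPLIERS]; /* links cached by the one-time parse */
	size_t nr_suppliers;
	int parsed;	 /* set once the properties have been turned into links */
	int parse_count; /* instrumentation: how often the properties were walked */
	int added;	 /* has a device been added for this node yet? */
};

static struct fw_node *find_node(struct fw_node *nodes, size_t n,
				 const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(nodes[i].name, name) == 0)
			return &nodes[i];
	return NULL;
}

/* Walk a node's properties at most once and cache the result as links. */
static void parse_once(struct fw_node *nodes, size_t n, struct fw_node *node)
{
	if (node->parsed)
		return;

	node->parse_count++;
	for (size_t i = 0; i < MAX_SUPPLIERS && node->refs[i]; i++) {
		struct fw_node *sup = find_node(nodes, n, node->refs[i]);

		if (sup)
			node->suppliers[node->nr_suppliers++] = sup;
	}
	node->parsed = 1;
}

/*
 * Return the first supplier that has not been added yet, or NULL if the
 * node is not waiting on anything. Only the cached links are consulted
 * after the first call.
 */
static struct fw_node *first_missing_supplier(struct fw_node *nodes, size_t n,
					      struct fw_node *node)
{
	assert(nodes && node);

	parse_once(nodes, n, node);
	for (size_t i = 0; i < node->nr_suppliers; i++)
		if (!node->suppliers[i]->added)
			return node->suppliers[i];
	return NULL;
}
```

Repeated calls to first_missing_supplier() never re-read refs[], and the
return value names the exact node a blocked consumer is still waiting
for, which is the kind of information that helps when debugging a
deferred probe.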

Laurent and Grygorii tested the v1 series and they saw boot time
improvements of about 12 seconds and 3 seconds, respectively.

Thanks,
Saravana

[1] - https://lore.kernel.org/linux-pm/CAGETcx-aiW251dhEXT1GNb9bi6YcX8W=jLBrro5CnPuEjGL09g@mail.gmail.com/

v1 -> v2:
Patches 1-6:
- Added the "why" to explain the reverts.
Patch 7:
- Fixed white space comment.
Patch 8:
- Reworded commit text and some function doc.
Patch 11:
- Fixed the build warning this patch would cause by removing a "const".
Patch 12:
- Added/updated documentation.
- Changed flags from u32 to u8.
Patch 13:
- Squashed with Patch 10. Will use v1 patch number for the rest of the diff
  descriptions.
Patch 15:
- Removed an unnecessary unlikely()
Patch 17:
- Refactored fw_devlink_create_devlink() to flip the error handling vs
  successful paths.
Patch 18:
- Squashed into Patch 17 as requested by Greg.
- Added Tested-by: tags from Laurent and Grygorii.
New Patch 17:
- New patch to delete useless input to add_links()

Saravana Kannan (17):
  Revert "driver core: Avoid deferred probe due to
    fw_devlink_pause/resume()"
  Revert "driver core: Rename dev_links_info.defer_sync to defer_hook"
  Revert "driver core: Don't do deferred probe in parallel with
    kernel_init thread"
  Revert "driver core: Remove check in
    driver_deferred_probe_force_trigger()"
  Revert "of: platform: Batch fwnode parsing when adding all top level
    devices"
  Revert "driver core: fw_devlink: Add support for batching fwnode
    parsing"
  driver core: Add fwnode_init()
  driver core: Add fwnode link support
  driver core: Allow only unprobed consumers for SYNC_STATE_ONLY device
    links
  device property: Add fwnode_is_ancestor_of() and
    fwnode_get_next_parent_dev()
  driver core: Redefine the meaning of fwnode_operations.add_links()
  driver core: Add fw_devlink_parse_fwtree()
  driver core: Use device's fwnode to check if it is waiting for
    suppliers
  of: property: Update implementation of add_links() to create fwnode
    links
  efi: Update implementation of add_links() to create fwnode links
  driver core: Refactor fw_devlink feature
  driver core: Delete pointless parameter in fwnode_operations.add_links

 drivers/acpi/property.c         |   2 +-
 drivers/acpi/scan.c             |   2 +-
 drivers/base/core.c             | 555 ++++++++++++++++++++------------
 drivers/base/property.c         |  52 +++
 drivers/base/swnode.c           |   2 +-
 drivers/firmware/efi/efi-init.c |  32 +-
 drivers/of/dynamic.c            |   1 +
 drivers/of/platform.c           |   2 -
 drivers/of/property.c           | 149 +++------
 include/linux/device.h          |  10 +-
 include/linux/fwnode.h          |  73 ++---
 include/linux/of.h              |   2 +-
 include/linux/property.h        |   3 +
 kernel/irq/irqdomain.c          |   2 +-
 14 files changed, 495 insertions(+), 392 deletions(-)

Comments

Tomi Valkeinen Nov. 24, 2020, 8:29 a.m. UTC | #1
Hi,

On 21/11/2020 04:02, Saravana Kannan wrote:
> Laurent and Grygorii tested the v1 series and they saw boot time
> improvements of about 12 seconds and 3 seconds, respectively.

Tested v2 on OMAP4 SDP. With my particular config, boot time to starting init went from 18.5 seconds
to 12.5 seconds.

 Tomi
Saravana Kannan Nov. 24, 2020, 5:25 p.m. UTC | #2
On Tue, Nov 24, 2020 at 12:29 AM 'Tomi Valkeinen' via kernel-team
<kernel-team@android.com> wrote:
>
> Hi,
>
> Tested v2 on OMAP4 SDP. With my particular config, boot time to starting init went from 18.5 seconds
> to 12.5 seconds.

Thanks for testing, Tomi!

-Saravana
Saravana Kannan Dec. 3, 2020, 7:05 p.m. UTC | #3
On Tue, Nov 24, 2020 at 12:29 AM 'Tomi Valkeinen' via kernel-team
<kernel-team@android.com> wrote:
>
> Hi,
>
> Tested v2 on OMAP4 SDP. With my particular config, boot time to starting init went from 18.5 seconds
> to 12.5 seconds.
>
>  Tomi

Rafael,

Friendly reminder for a review.

-Saravana
Greg Kroah-Hartman Dec. 9, 2020, 6:16 p.m. UTC | #4
On Fri, Nov 20, 2020 at 06:02:15PM -0800, Saravana Kannan wrote:
> Laurent and Grygorii tested the v1 series and they saw boot time
> improvements of about 12 seconds and 3 seconds, respectively.

Now queued up to my tree.  Note, I had to hand-apply patches 13 and 16
due to some reason (for 13, I have no idea, for 16 it was due to a
previous patch applied to my tree that I cc:ed you on.)

Verifying I got it all correct would be great :)

thanks,

greg k-h
Saravana Kannan Dec. 9, 2020, 8:24 p.m. UTC | #5
On Wed, Dec 9, 2020 at 10:15 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> Now queued up to my tree.  Note, I had to hand-apply patches 13 and 16
> due to some reason (for 13, I have no idea, for 16 it was due to a
> previous patch applied to my tree that I cc:ed you on.)
>
> Verifying I got it all correct would be great :)

A quick diff of drivers/base/core.c between driver-core-testing and my
local tree doesn't show any major diff (only some unrelated comment
fixes). So, it looks fine.

The patch 13 conflict is probably due to having to rebase the v2
series on top of this:
https://lore.kernel.org/lkml/20201104205431.3795207-1-saravanak@google.com/

And looks like Patch 16 was handled fine.

Thanks for applying the series.

-Saravana
Greg Kroah-Hartman Dec. 10, 2020, 9:26 a.m. UTC | #6
On Wed, Dec 09, 2020 at 12:24:32PM -0800, Saravana Kannan wrote:
> On Wed, Dec 9, 2020 at 10:15 AM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > Now queued up to my tree.  Note, I had to hand-apply patches 13 and 16
> > due to some reason (for 13, I have no idea, for 16 it was due to a
> > previous patch applied to my tree that I cc:ed you on.)
> >
> > Verifying I got it all correct would be great :)
> 
> A quick diff of drivers/base/core.c between driver-core-testing and my
> local tree doesn't show any major diff (only some unrelated comment
> fixes). So, it looks fine.
> 
> The patch 13 conflict is probably due to having to rebase the v2
> series on top of this:
> https://lore.kernel.org/lkml/20201104205431.3795207-1-saravanak@google.com/
> 
> And looks like Patch 16 was handled fine.

Great, thanks for verifying!

greg k-h