
[driver-core,v9,0/9] Add NUMA aware async_schedule calls

Message ID 154466182249.9126.3905559325944768059.stgit@ahduyck-desk1.jf.intel.com (mailing list archive)


Alexander Duyck Dec. 13, 2018, 12:44 a.m. UTC
This patch set provides functionality that will help to improve the
locality of the async_schedule calls used to provide deferred
initialization.

This patch set originally started out focused on just the one call to
async_schedule_domain in the nvdimm tree that was being used to defer the
device_add call. However, after doing some digging I realized the scope of
this was much broader than I had originally planned. As such I went
through and reworked the underlying infrastructure, down to replacing the
queue_work call itself with a function of my own, and opted to provide a
NUMA aware solution that would work for a broader audience.
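
As a rough illustration of what NUMA aware queueing means here, the sketch
below models turning a node id into a CPU choice in plain userspace C. It is
not the code from this series: select_cpu_near, NO_NODE, CPU_UNBOUND, and the
cpu_node topology array are hypothetical names, and the preference for
staying on the current CPU when it already sits on the target node is an
assumption.

```c
#include <assert.h>
#include <stddef.h>

#define NO_NODE     (-1)  /* hypothetical: work has no NUMA affinity */
#define CPU_UNBOUND (-1)  /* hypothetical: let the scheduler pick any CPU */

/*
 * Toy model: cpu_node[i] holds the NUMA node of CPU i.
 * Returns a CPU on the requested node, or CPU_UNBOUND if none exists.
 */
static int select_cpu_near(const int *cpu_node, size_t ncpus,
                           int current_cpu, int node)
{
    assert(cpu_node != NULL);

    /* No affinity requested, so do not constrain placement. */
    if (node == NO_NODE)
        return CPU_UNBOUND;

    /* Stay on the current CPU if it already belongs to the node. */
    if (current_cpu >= 0 && (size_t)current_cpu < ncpus &&
        cpu_node[current_cpu] == node)
        return current_cpu;

    /* Otherwise take the first CPU found on the node. */
    for (size_t cpu = 0; cpu < ncpus; cpu++)
        if (cpu_node[cpu] == node)
            return (int)cpu;

    /* Node has no CPUs in this model: fall back to unbound. */
    return CPU_UNBOUND;
}
```

With a four-CPU topology of {0, 0, 1, 1}, a request for node 1 made from
CPU 0 lands on CPU 2, while the same request made from CPU 3 stays on CPU 3.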

In addition I have added several tweaks and clean-ups to the front of the
patch set. Patches 1 through 3 address a number of issues that were keeping
the existing async_schedule calls from showing the performance they could,
either because they did not scale on a per-device basis or because they
could result in a race. For example, patch 3 addresses the fact that we
were calling async_schedule once per driver instead of once per device.
Without addressing this first, we would still have ended up with devices
being probed on a non-local node.
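
The per-driver versus per-device point can be sketched with a small
userspace model. Everything here is hypothetical (plan_per_driver,
plan_per_device, count_off_node, and the arrays of node ids); it only counts
how many devices would be probed away from their own node under each
scheme, and does not reflect the actual driver core code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Per-driver scheduling: one async job probes every device, so every probe
 * runs on the node where that single job was queued. Returns the job count.
 */
static size_t plan_per_driver(size_t ndevs, int job_node, int *probe_node)
{
    assert(probe_node != NULL || ndevs == 0);
    for (size_t i = 0; i < ndevs; i++)
        probe_node[i] = job_node;
    return ndevs ? 1 : 0;
}

/*
 * Per-device scheduling: each device gets its own job, which can be placed
 * on that device's node. Returns the job count.
 */
static size_t plan_per_device(const int *dev_node, size_t ndevs,
                              int *probe_node)
{
    assert((dev_node != NULL && probe_node != NULL) || ndevs == 0);
    for (size_t i = 0; i < ndevs; i++)
        probe_node[i] = dev_node[i];
    return ndevs;
}

/* Count devices whose probe would run on a node other than their own. */
static size_t count_off_node(const int *dev_node, const int *probe_node,
                             size_t ndevs)
{
    size_t off = 0;

    for (size_t i = 0; i < ndevs; i++)
        if (dev_node[i] != probe_node[i])
            off++;
    return off;
}
```

For devices on nodes {0, 1, 1, 0} and a driver job queued on node 0, the
per-driver plan uses one job and probes two devices off-node, while the
per-device plan uses four jobs and probes none off-node.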

I have also updated the kernel module used to test async driver probing so
that it can expose the original issue I was attempting to address.
It will fail on a system if asynchronous work takes longer than it
takes to load a single device and a single driver with a device already
added. It will also fail if the NUMA node that the driver is loaded on does
not match the NUMA node the device is associated with.
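
The two failure conditions described above can be summarized in a small
userspace sketch. The names check_probe, probe_result, and NO_NODE are
hypothetical, the time budget is passed in rather than derived from real
probe timings, and treating a device without a node as always passing the
affinity check is an assumption; the test module itself may differ.

```c
#include <assert.h>

#define NO_NODE (-1)  /* hypothetical: device has no NUMA node assigned */

enum probe_result { PROBE_OK, PROBE_TOO_SLOW, PROBE_WRONG_NODE };

/*
 * elapsed_ms: time the asynchronous work took
 * budget_ms:  time allowed (e.g. one device load plus one driver load)
 * run_node:   NUMA node the probe actually ran on
 * dev_node:   NUMA node the device is associated with
 */
static enum probe_result check_probe(long elapsed_ms, long budget_ms,
                                     int run_node, int dev_node)
{
    assert(budget_ms >= 0);

    /* Work that overruns the budget suggests probes were serialized. */
    if (elapsed_ms > budget_ms)
        return PROBE_TOO_SLOW;

    /* A probe on the wrong node means locality was not honored. */
    if (dev_node != NO_NODE && run_node != dev_node)
        return PROBE_WRONG_NODE;

    return PROBE_OK;
}
```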

    Dropped nvdimm patch to submit later.
        It relies on code in libnvdimm development tree.
    Simplified queue_work_near to just convert node into a CPU.
    Split up drivers core and PM core patches.
    Renamed queue_work_near to queue_work_node
    Added WARN_ON_ONCE if we use queue_work_node with per-cpu workqueue
    Added Acked-by for queue_work_node patch
    Continued rename from _near to _node to be consistent with queue_work_node
        Renamed async_schedule_near_domain to async_schedule_node_domain
        Renamed async_schedule_near to async_schedule_node
    Added kerneldoc for new async_schedule_XXX functions
    Updated patch description for patch 4 to include data on potential gains
    Added patch to consolidate use of need_parent_lock
    Make asynchronous driver probing explicit about use of drvdata
    Added patch to move async_synchronize_full to address deadlock
    Added bit async_probe to act as mutex for probe/remove calls
    Added back nvdimm patch as code it relies on is now in Linus's tree
    Incorporated review comments on parent & device locking consolidation
    Rebased on latest linux-next
    Drop the "This patch" or "This change" from start of patch descriptions.
    Drop unnecessary parentheses in first patch
    Use same wording for "selecting a CPU" in comments added in first patch
    Added kernel documentation for async_probe member of device
    Fixed up comments for async_schedule calls in patch 2
    Moved code related to setting async driver out of device.h and into dd.c
    Added Reviewed-by for several patches
    Fixed typo which had kernel doc refer to "lock" when I meant "unlock"
    Dropped "bool X:1" to "u8 X:1" from patch description
    Added async_driver to device_private structure to store driver
    Dropped unnecessary code shuffle from async_probe patch
    Reordered patches to move fixes up to front
    Added Reviewed-by for several patches
    Updated cover page and patch descriptions throughout the set
    Replaced async_probe value with dead, only apply dead in device_del
    Dropped Reviewed-by from patch 2 due to significant changes
    Added Reviewed-by for patches reviewed by Luis Chamberlain
    Dropped patch 1 as it was applied, shifted remaining patches by 1
    Added new patch 9 that adds test framework for NUMA and sequential init
    Tweaked what is now patch 1, and added Reviewed-by from Dan Williams


Alexander Duyck (9):
      driver core: Establish order of operations for device_add and device_del via bitflag
      device core: Consolidate locking and unlocking of parent and device
      driver core: Probe devices asynchronously instead of the driver
      workqueue: Provide queue_work_node to queue work near a given NUMA node
      async: Add support for queueing on specific NUMA node
      driver core: Attach devices on CPU local to device node
      PM core: Use new async_schedule_dev command
      libnvdimm: Schedule device registration on node local to the device
      driver core: Rewrite test_async_driver_probe to cover serialization and NUMA affinity

 drivers/base/base.h                         |    4 
 drivers/base/bus.c                          |   46 +----
 drivers/base/core.c                         |   11 +
 drivers/base/dd.c                           |  160 +++++++++++++----
 drivers/base/power/main.c                   |   12 +
 drivers/base/test/test_async_driver_probe.c |  261 +++++++++++++++++++++------
 drivers/nvdimm/bus.c                        |   11 +
 include/linux/async.h                       |   82 ++++++++
 include/linux/device.h                      |    5 +
 include/linux/workqueue.h                   |    2 
 kernel/async.c                              |   53 +++--
 kernel/workqueue.c                          |   84 +++++++++
 12 files changed, 565 insertions(+), 166 deletions(-)