Message ID | 154403054034.11544.3978949383914046587.stgit@ahduyck-desk1.jf.intel.com (mailing list archive) |
---|---|
Series | Add NUMA aware async_schedule calls |
On Wed, Dec 05, 2018 at 09:25:13AM -0800, Alexander Duyck wrote:
> This patch set provides functionality that will help to improve the
> locality of the async_schedule calls used to provide deferred
> initialization.
>
> This patch set originally started out focused on just the one call to
> async_schedule_domain in the nvdimm tree that was being used to defer the
> device_add call however after doing some digging I realized the scope of
> this was much broader than I had originally planned. As such I went
> through and reworked the underlying infrastructure down to replacing the
> queue_work call itself with a function of my own and opted to try and
> provide a NUMA aware solution that would work for a broader audience.
>
> In addition I have added several tweaks and/or clean-ups to the front of the
> patch set. Patches 1 through 4 address a number of issues that actually were
> causing the existing async_schedule calls to not show the performance that
> they could due to either not scaling on a per device basis, or due to issues
> that could result in a potential deadlock. For example, patch 4 addresses the
> fact that we were calling async_schedule once per driver instead of once
> per device, and as a result we would have still ended up with devices
> being probed on a non-local node without addressing this first.

No tests were added. Again, I think it would be good to add test
cases to showcase the old mechanisms, illustrate the new, and ensure
we don't regress both now and also help us ensure we don't regress
moving forward.

This is all too critical of a path for the kernel, and these changes
are rather intrusive. I'd really like to see test code for it now
rather than later.

  Luis

On Mon, 2018-12-10 at 11:22 -0800, Luis Chamberlain wrote:
> On Wed, Dec 05, 2018 at 09:25:13AM -0800, Alexander Duyck wrote:
> > This patch set provides functionality that will help to improve the
> > locality of the async_schedule calls used to provide deferred
> > initialization.
> >
> > This patch set originally started out focused on just the one call to
> > async_schedule_domain in the nvdimm tree that was being used to defer the
> > device_add call however after doing some digging I realized the scope of
> > this was much broader than I had originally planned. As such I went
> > through and reworked the underlying infrastructure down to replacing the
> > queue_work call itself with a function of my own and opted to try and
> > provide a NUMA aware solution that would work for a broader audience.
> >
> > In addition I have added several tweaks and/or clean-ups to the front of the
> > patch set. Patches 1 through 4 address a number of issues that actually were
> > causing the existing async_schedule calls to not show the performance that
> > they could due to either not scaling on a per device basis, or due to issues
> > that could result in a potential deadlock. For example, patch 4 addresses the
> > fact that we were calling async_schedule once per driver instead of once
> > per device, and as a result we would have still ended up with devices
> > being probed on a non-local node without addressing this first.
>
> No tests were added. Again, I think it would be good to add test
> cases to showcase the old mechanisms, illustrate the new, and ensure
> we don't regress both now and also help us ensure we don't regress
> moving forward.
>
> This is all too critical of a path for the kernel, and these changes
> are rather intrusive. I'd really like to see test code for it now
> rather than later.
>
>   Luis

Sorry about that. I was more focused on the rewrite of patch 2 and
overlooked the comment about lib/test_kmod.c.

I'll look into it and see if I can squeeze it in for v9.

Thanks.

- Alex

On Mon, Dec 10, 2018 at 03:25:04PM -0800, Alexander Duyck wrote:
> On Mon, 2018-12-10 at 11:22 -0800, Luis Chamberlain wrote:
> > On Wed, Dec 05, 2018 at 09:25:13AM -0800, Alexander Duyck wrote:
> > > This patch set provides functionality that will help to improve the
> > > locality of the async_schedule calls used to provide deferred
> > > initialization.
> > >
> > > This patch set originally started out focused on just the one call to
> > > async_schedule_domain in the nvdimm tree that was being used to defer the
> > > device_add call however after doing some digging I realized the scope of
> > > this was much broader than I had originally planned. As such I went
> > > through and reworked the underlying infrastructure down to replacing the
> > > queue_work call itself with a function of my own and opted to try and
> > > provide a NUMA aware solution that would work for a broader audience.
> > >
> > > In addition I have added several tweaks and/or clean-ups to the front of the
> > > patch set. Patches 1 through 4 address a number of issues that actually were
> > > causing the existing async_schedule calls to not show the performance that
> > > they could due to either not scaling on a per device basis, or due to issues
> > > that could result in a potential deadlock. For example, patch 4 addresses the
> > > fact that we were calling async_schedule once per driver instead of once
> > > per device, and as a result we would have still ended up with devices
> > > being probed on a non-local node without addressing this first.
> >
> > No tests were added. Again, I think it would be good to add test
> > cases to showcase the old mechanisms, illustrate the new, and ensure
> > we don't regress both now and also help us ensure we don't regress
> > moving forward.
> >
> > This is all too critical of a path for the kernel, and these changes
> > are rather intrusive. I'd really like to see test code for it now
> > rather than later.
> >
> >   Luis
>
> Sorry about that. I was more focused on the rewrite of patch 2 and
> overlooked the comment about lib/test_kmod.c.
>
> I'll look into it and see if I can squeeze it in for v9.

Superb!

  Luis