mbox series

[v8,0/7] mm: Merge hmm into devm_memremap_pages, mark GPL-only

Message ID 154275556908.76910.8966087090637564219.stgit@dwillia2-desk3.amr.corp.intel.com (mailing list archive)
Headers show
Series mm: Merge hmm into devm_memremap_pages, mark GPL-only | expand

Message

Dan Williams Nov. 20, 2018, 11:12 p.m. UTC
Changes since v7 [1]:
* Rebase on next-20181119

[1]: https://lkml.org/lkml/2018/10/12/878

---

At Maintainer Summit, Greg brought up a topic I proposed around
EXPORT_SYMBOL_GPL usage. The motivation was considerations for when
EXPORT_SYMBOL_GPL is warranted and the criteria for taking the
exceptional step of reclassifying an existing export. Specifically, I
wanted to make the case that although the line is fuzzy and hard to
specify in abstract terms, it is nonetheless clear that
devm_memremap_pages() and HMM (Heterogeneous Memory Management) have
crossed it. The devm_memremap_pages() facility should have been
EXPORT_SYMBOL_GPL from the beginning, and HMM as a derivative of that
functionality should have naturally picked up that designation as well.

Contrary to typical rules, the HMM infrastructure was merged upstream
with zero in-tree consumers. There was a promise at the time that those
users would be merged "soon", but it has been over a year with no drivers
arriving. While the Nouveau driver is about to belatedly make good on
that promise it is clear that HMM was targeted first and foremost at an
out-of-tree consumer.

HMM is derived from devm_memremap_pages(), a facility Christoph and I
spearheaded to support persistent memory. It combines a device lifetime
model with a dynamically created 'struct page' / memmap array for any
physical address range. It enables coordination and control of the many
code paths in the kernel built to interact with memory via 'struct page'
objects. With HMM the integration goes even deeper by allowing device
drivers to hook and manipulate page fault and page free events.

One interpretation of when EXPORT_SYMBOL is suitable is when it is
exporting stable and generic leaf functionality.  The
devm_memremap_pages() facility continues to see expanding use cases,
peer-to-peer DMA being the most recent, with no clear end date when it
will stop attracting reworks and semantic changes. It is not suitable to
export devm_memremap_pages() as a stable 3rd party driver API due to the
fact that it is still changing and manipulates core behavior. Moreover,
it is not in the best interest of the long term development of the core
memory management subsystem to permit any external driver to effectively
define its own system-wide memory management policies with no
encouragement to engage with upstream.

I am also concerned that HMM was designed in a way to minimize further
engagement with the core-MM. That, with these hooks in place,
device-drivers are free to implement their own policies without much
consideration for whether and how the core-MM could grow to meet that
need. Going forward not only should HMM be EXPORT_SYMBOL_GPL, but the
core-MM should be allowed the opportunity and stimulus to change and
address these new use cases as first class functionality.

There is some more detailed justification in the individual changelogs.
The 0day infrastructure has reported build success on 102 configs and
this survives the libnvdimm unit test suite. Setting aside the
controversial aspect, the diffstat is compelling at:

	7 files changed, 126 insertions(+), 323 deletions(-)

---

Dan Williams (7):
      mm, devm_memremap_pages: Mark devm_memremap_pages() EXPORT_SYMBOL_GPL
      mm, devm_memremap_pages: Kill mapping "System RAM" support
      mm, devm_memremap_pages: Fix shutdown handling
      mm, devm_memremap_pages: Add MEMORY_DEVICE_PRIVATE support
      mm, hmm: Use devm semantics for hmm_devmem_{add,remove}
      mm, hmm: Replace hmm_devmem_pages_create() with devm_memremap_pages()
      mm, hmm: Mark hmm_devmem_{add,add_resource} EXPORT_SYMBOL_GPL


 drivers/dax/pmem.c                |   14 --
 drivers/nvdimm/pmem.c             |   13 +-
 include/linux/hmm.h               |    4 
 include/linux/memremap.h          |    2 
 kernel/memremap.c                 |   94 +++++++----
 mm/hmm.c                          |  305 +++++--------------------------------
 tools/testing/nvdimm/test/iomap.c |   17 ++
 7 files changed, 126 insertions(+), 323 deletions(-)

Comments

Andrew Morton Nov. 22, 2018, 1:20 a.m. UTC | #1
On Tue, 20 Nov 2018 15:12:49 -0800 Dan Williams <dan.j.williams@intel.com> wrote:

> Changes since v7 [1]:
> At Maintainer Summit, Greg brought up a topic I proposed around
> EXPORT_SYMBOL_GPL usage. The motivation was considerations for when
> EXPORT_SYMBOL_GPL is warranted and the criteria for taking the
> exceptional step of reclassifying an existing export. Specifically, I
> wanted to make the case that although the line is fuzzy and hard to
> specify in abstract terms, it is nonetheless clear that
> devm_memremap_pages() and HMM (Heterogeneous Memory Management) have
> crossed it. The devm_memremap_pages() facility should have been
> EXPORT_SYMBOL_GPL from the beginning, and HMM as a derivative of that
> functionality should have naturally picked up that designation as well.
> 
> Contrary to typical rules, the HMM infrastructure was merged upstream
> with zero in-tree consumers. There was a promise at the time that those
> users would be merged "soon", but it has been over a year with no drivers
> arriving. While the Nouveau driver is about to belatedly make good on
> that promise it is clear that HMM was targeted first and foremost at an
> out-of-tree consumer.
> 
> HMM is derived from devm_memremap_pages(), a facility Christoph and I
> spearheaded to support persistent memory. It combines a device lifetime
> model with a dynamically created 'struct page' / memmap array for any
> physical address range. It enables coordination and control of the many
> code paths in the kernel built to interact with memory via 'struct page'
> objects. With HMM the integration goes even deeper by allowing device
> drivers to hook and manipulate page fault and page free events.
> 
> One interpretation of when EXPORT_SYMBOL is suitable is when it is
> exporting stable and generic leaf functionality.  The
> devm_memremap_pages() facility continues to see expanding use cases,
> peer-to-peer DMA being the most recent, with no clear end date when it
> will stop attracting reworks and semantic changes. It is not suitable to
> export devm_memremap_pages() as a stable 3rd party driver API due to the
> fact that it is still changing and manipulates core behavior. Moreover,
> it is not in the best interest of the long term development of the core
> memory management subsystem to permit any external driver to effectively
> define its own system-wide memory management policies with no
> encouragement to engage with upstream.
> 
> I am also concerned that HMM was designed in a way to minimize further
> engagement with the core-MM. That, with these hooks in place,
> device-drivers are free to implement their own policies without much
> consideration for whether and how the core-MM could grow to meet that
> need. Going forward not only should HMM be EXPORT_SYMBOL_GPL, but the
> core-MM should be allowed the opportunity and stimulus to change and
> address these new use cases as first class functionality.
> 

The arguments are compelling.  I apologize for not thinking of and/or
not being made aware of them at the time.

I'll take [7/7] (with all the above added to the changelog) with a view
to a 4.21-rc1 merge.  That gives us a couple of months for further
discussion.  Public discussion, please.

It should be noted that [7/7] has a cc:stable.
Jerome Glisse Dec. 3, 2018, 11:37 p.m. UTC | #2
On Wed, Nov 21, 2018 at 05:20:55PM -0800, Andrew Morton wrote:
> On Tue, 20 Nov 2018 15:12:49 -0800 Dan Williams <dan.j.williams@intel.com> wrote:

[...]

> > I am also concerned that HMM was designed in a way to minimize further
> > engagement with the core-MM. That, with these hooks in place,
> > device-drivers are free to implement their own policies without much
> > consideration for whether and how the core-MM could grow to meet that
> > need. Going forward not only should HMM be EXPORT_SYMBOL_GPL, but the
> > core-MM should be allowed the opportunity and stimulus to change and
> > address these new use cases as first class functionality.
> > 
> 
> The arguments are compelling.  I apologize for not thinking of and/or
> not being made aware of them at the time.

So i wanted to comment on that part. Yes HMM is an impedence layer
that goes both way ie device driver are shielded from core mm and
core mm folks do not need to understand individual driver to modify
mm, they only need to understand what is provided to the driver by
HMM (and keeps HMM promise intact from driver POV no matter how it
is achieve). So this is intentional.

Nonetheless I want to grow core mm involvement in managing those
memory (see patchset i just posted about hbind() and heterogeneous
memory system). But i do not expect that core mm will be in full
control at least not for some time. The historical reasons is that
device like GPU are not only use for compute (which is where HMM
gets use) but also for graphics (simple desktop or even games).
Those are two differents workload using different API (CUDA/OpenCL
for compute, OpenGL/Vulkan for graphics) on the same underlying
hardware.

Those API expose the hardware in incompatible way when it comes to
memory management (especialy API like Vulkan). Managing memory page
wise is not well suited for graphics. The issues comes from the
fact that we do not want to exclude either workload from happening
concurrently (running your destkop while some compute job is running
in the background). So for this to work we need to keep the device
driver in control of its memory (hence why callback when page are
freed for instance). We also need to forbid things like pinning any
device memory pages ...


I still expect some commonality to emerge accross different hardware
so that we can grow more things and share more code into core mm but
i want to get their organicaly, not forcing everyone into a design
today. I expect this will happens by going from high level concept,
how things get use in userspace from end user POV, and working back-
ward from there to see what common API (if any) we can provided to
catter those common use case.

Cheers,
Jérôme