[GIT,PULL] Please pull hmm changes

Message ID 20191125204248.GA2485@ziepe.ca

Pull-request

git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git tags/for-linus-hmm

Message

Jason Gunthorpe Nov. 25, 2019, 8:42 p.m. UTC
Hi Linus,

Here is this batch of hmm updates. I think we are nearing the end of this
project for now, although I suspect there will be some more patches related to
hmm_range_fault() in the next cycle.

You will probably be most interested in the patch "mm/mmu_notifier: add an
interval tree notifier". The approach here largely pre-exists in the various
drivers, but is honestly kind of complex/ugly. No better idea was found, I'm
hoping putting it all in one place will help improve this over the long
term. At least many bugs were squashed and lines of code eliminated while
consolidating.

The i915 GPU driver has already posted a series for the next window that also
needs this same approach.

There are two small conflicts I know of: the first is RDMA related, against
-rc; the second is a one-liner updating a deleted comment in GPU. Both can be
resolved by taking the hmm.git side of the conflict.

All the big driver changes have been acked and/or tested by their respective
maintainers.

Regards,
Jason

The following changes since commit d6d5df1db6e9d7f8f76d2911707f7d5877251b02:

  Linux 5.4-rc5 (2019-10-27 13:19:19 -0400)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git tags/for-linus-hmm

for you to fetch changes up to 93f4e735b6d98ee4b7a1252d81e815a983e359f2:

  mm/hmm: remove hmm_range_dma_map and hmm_range_dma_unmap (2019-11-23 19:56:45 -0400)

----------------------------------------------------------------
hmm related patches for 5.5

This is another round of bug fixing and cleanup. This time the focus is on
the driver pattern of using mmu notifiers to monitor a VA range. This code
is lifted out of the many drivers and out of hmm_mirror, directly into the
mmu_notifier core, and is written using the best ideas from all the driver
implementations.

This removes many bugs from the drivers and has a very pleasing
diffstat. More drivers can still be converted, but that is for another
cycle.

- A shared branch with RDMA reworking the RDMA ODP implementation

- New mmu_interval_notifier API. This is focused on the use case of
  monitoring a VA and simplifies the process for drivers

- A common seq-count locking scheme built into the mmu_interval_notifier
  API, usable by drivers that call get_user_pages() or hmm_range_fault()
  with the VA range (see the sketch after this list)

- Conversion of mlx5 ODP, hfi1, radeon, nouveau, AMD GPU, and Xen GntDev
  drivers to the new API. This deletes a lot of wonky driver code.

- Two improvements for hmm_range_fault(), from testing done by Ralph
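
As a rough illustration of the seq-count scheme above, a driver-side fault
path looks like this minimal sketch (the function name, driver_lock, and the
pre-populated hmm_range are placeholders; the canonical pattern is described
in Documentation/vm/hmm.rst):

    static int driver_fault_and_map(struct mmu_interval_notifier *mni,
                                    struct mm_struct *mm,
                                    struct hmm_range *range,
                                    struct mutex *driver_lock)
    {
            long ret;

    again:
            range->notifier_seq = mmu_interval_read_begin(mni);
            down_read(&mm->mmap_sem);
            ret = hmm_range_fault(range, 0);
            up_read(&mm->mmap_sem);
            if (ret < 0) {
                    if (ret == -EBUSY)
                            goto again;     /* collided with an invalidation */
                    return ret;
            }

            mutex_lock(driver_lock);
            if (mmu_interval_read_retry(mni, range->notifier_seq)) {
                    /* an invalidation ran meanwhile; the pfns are stale */
                    mutex_unlock(driver_lock);
                    goto again;
            }
            /* pfns are stable while driver_lock is held: program device PTEs */
            mutex_unlock(driver_lock);
            return 0;
    }

The driver's invalidate callback takes the same driver_lock before calling
mmu_interval_set_seq(), which is what makes the mmu_interval_read_retry()
check reliable.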

----------------------------------------------------------------
Christoph Hellwig (1):
      mm/hmm: remove hmm_range_dma_map and hmm_range_dma_unmap

Jason Gunthorpe (30):
      RDMA/mlx5: Use SRCU properly in ODP prefetch
      RDMA/mlx5: Split sig_err MR data into its own xarray
      RDMA/mlx5: Use a dedicated mkey xarray for ODP
      RDMA/mlx5: Delete struct mlx5_priv->mkey_table
      RDMA/mlx5: Rework implicit_mr_get_data
      RDMA/mlx5: Lift implicit_mr_alloc() into the two routines that call it
      RDMA/mlx5: Set the HW IOVA of the child MRs to their place in the tree
      RDMA/mlx5: Split implicit handling from pagefault_mr
      RDMA/mlx5: Use an xarray for the children of an implicit ODP
      RDMA/mlx5: Reduce locking in implicit_mr_get_data()
      RDMA/mlx5: Avoid double lookups on the pagefault path
      RDMA/mlx5: Rework implicit ODP destroy
      RDMA/mlx5: Do not store implicit children in the odp_mkeys xarray
      RDMA/mlx5: Do not race with mlx5_ib_invalidate_range during create and destroy
      RDMA/odp: Remove broken debugging call to invalidate_range
      Merge branch 'odp_rework' into hmm.git
      mm/mmu_notifier: define the header pre-processor parts even if disabled
      mm/mmu_notifier: add an interval tree notifier
      mm/hmm: allow hmm_range to be used with a mmu_interval_notifier or hmm_mirror
      mm/hmm: define the pre-processor related parts of hmm.h even if disabled
      RDMA/odp: Use mmu_interval_notifier_insert()
      RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv
      drm/radeon: use mmu_interval_notifier_insert
      nouveau: use mmu_notifier directly for invalidate_range_start
      nouveau: use mmu_interval_notifier instead of hmm_mirror
      drm/amdgpu: Call find_vma under mmap_sem
      drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror
      drm/amdgpu: Use mmu_interval_notifier instead of hmm_mirror
      mm/hmm: remove hmm_mirror and related
      xen/gntdev: use mmu_interval_notifier_insert

Ralph Campbell (2):
      mm/hmm: allow snapshot of the special zero page
      mm/hmm: make full use of walk_page_range()

 Documentation/vm/hmm.rst                         |  105 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu.h              |    2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c |    9 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c           |   14 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c       |    1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c           |  443 ++--------
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h           |   53 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h       |   13 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c          |  145 ++--
 drivers/gpu/drm/nouveau/nouveau_svm.c            |  230 +++--
 drivers/gpu/drm/radeon/radeon.h                  |    9 +-
 drivers/gpu/drm/radeon/radeon_mn.c               |  218 +----
 drivers/infiniband/core/device.c                 |    1 -
 drivers/infiniband/core/umem_odp.c               |  341 ++------
 drivers/infiniband/hw/hfi1/file_ops.c            |    2 +-
 drivers/infiniband/hw/hfi1/hfi.h                 |    2 +-
 drivers/infiniband/hw/hfi1/user_exp_rcv.c        |  146 ++--
 drivers/infiniband/hw/hfi1/user_exp_rcv.h        |    3 +-
 drivers/infiniband/hw/mlx5/cq.c                  |   33 +-
 drivers/infiniband/hw/mlx5/devx.c                |    8 +-
 drivers/infiniband/hw/mlx5/main.c                |   17 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h             |   29 +-
 drivers/infiniband/hw/mlx5/mr.c                  |  142 ++-
 drivers/infiniband/hw/mlx5/odp.c                 | 1004 +++++++++++-----------
 drivers/net/ethernet/mellanox/mlx5/core/main.c   |    4 -
 drivers/net/ethernet/mellanox/mlx5/core/mr.c     |   28 +-
 drivers/xen/gntdev-common.h                      |    8 +-
 drivers/xen/gntdev.c                             |  179 ++--
 include/linux/hmm.h                              |  190 +---
 include/linux/mlx5/driver.h                      |    4 -
 include/linux/mmu_notifier.h                     |  147 +++-
 include/rdma/ib_umem_odp.h                       |   86 +-
 include/rdma/ib_verbs.h                          |    2 -
 kernel/fork.c                                    |    1 -
 mm/Kconfig                                       |    2 +-
 mm/hmm.c                                         |  523 ++---------
 mm/mmu_notifier.c                                |  557 +++++++++++-
 37 files changed, 1912 insertions(+), 2789 deletions(-)

Comments

Linus Torvalds Nov. 30, 2019, 6:03 p.m. UTC | #1
On Mon, Nov 25, 2019 at 12:42 PM Jason Gunthorpe <jgg@mellanox.com> wrote:
>
> You will probably be most interested in the patch "mm/mmu_notifier: add an
> interval tree notifier".

I'm trying to read that patch, and I'm completely unable to because of the
absolutely *horrid* variable names.

There are zero excuses for names like "mmn_mm". WTF?

I'll try to figure the code out, but my initial reaction was "yeah,
not in my VM".

                   Linus
Linus Torvalds Nov. 30, 2019, 6:23 p.m. UTC | #2
On Sat, Nov 30, 2019 at 10:03 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> I'll try to figure the code out, but my initial reaction was "yeah,
> not in my VM".

Why is it ok to sometimes do

    WRITE_ONCE(mni->invalidate_seq, cur_seq);

(to pair with the unlocked READ_ONCE), and sometimes then do

    mni->invalidate_seq = mmn_mm->invalidate_seq;

My initial guess was that the latter is only done at initialization time,
but at least in one case it's done *after* the mni has been added to
the mmn_mm (oh, how I despise those names - I can only repeat: WTF?).

See __mmu_interval_notifier_insert() in the
mmn_mm->active_invalidate_ranges case.

I'm guessing that it doesn't matter, because when inserting the
notifier the sequence number is presumably not used until after the
insertion (and any use through mmn_mm is protected by the
mmn_mm->lock), but it still looks odd to me.

               Linus
Linus Torvalds Nov. 30, 2019, 6:35 p.m. UTC | #3
On Mon, Nov 25, 2019 at 12:42 PM Jason Gunthorpe <jgg@mellanox.com> wrote:
>
> Here is this batch of hmm updates, I think we are nearing the end of this
> project for now, although I suspect there will be some more patches related to
> hmm_range_fault() in the next cycle.

I've ended up pulling this, but I'm not entirely happy with the code.
You've already seen the comments on it in the earlier replies.

            Linus
Jason Gunthorpe Dec. 3, 2019, 2:42 a.m. UTC | #4
On Sat, Nov 30, 2019 at 10:23:31AM -0800, Linus Torvalds wrote:
> On Sat, Nov 30, 2019 at 10:03 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > I'll try to figure the code out, but my initial reaction was "yeah,
> > not in my VM".
> 
> Why is it ok to sometimes do
> 
>     WRITE_ONCE(mni->invalidate_seq, cur_seq);
> 
> (to pair with the unlocked READ_ONCE), and sometimes then do
> 
>     mni->invalidate_seq = mmn_mm->invalidate_seq;
> 
> My initial guess was that the latter is only done at initialization time,
> but at least in one case it's done *after* the mni has been added to
> the mmn_mm (oh, how I despise those names - I can only repeat: WTF?).

Yes, the only occurrences are in the notifier_insert, under the
spinlock. The one case where it is out of the natural order was to
make the manipulation of seq a bit saner; but in all cases, since the
spinlock is held, there is no way for another thread to get the pointer
to the 'mmu_interval_notifier *' to do the unlocked read.
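
In sketch form (simplified; the real insert path also defers the itree
update while an invalidation is running):

    spin_lock(&mmn_mm->lock);
    /*
     * mni is not yet reachable through the itree, so no reader can do
     * the unlocked READ_ONCE() on it yet; a plain store is fine.
     */
    mni->invalidate_seq = mmn_mm->invalidate_seq;
    interval_tree_insert(&mni->interval_tree, &mmn_mm->itree); /* publish */
    spin_unlock(&mmn_mm->lock);

    /* Once published, updates must pair with the unlocked read: */
    WRITE_ONCE(mni->invalidate_seq, cur_seq);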

Regarding the ugly names.. Naming has been really hard here because
currently everything is a 'mmu notifier' and the natural abbreviations
from there are crummy. Here is the basic summary:

struct mmu_notifier_mm (ie the mm->mmu_notifier_mm)
   -> mmn_mm
struct mm_struct 
   -> mm
struct mmu_notifier (ie the user subscription to the mm_struct)
   -> mn
struct mmu_interval_notifier (the other kind of user subscription)
   -> mni
struct mmu_notifier_range (ie the args to invalidate_range)
   -> range

I can send a patch to switch mmn_mm to mmu_notifier_mm, which is the
only pre-existing name for this value. But IIRC, it is somewhat ugly
with long line wrapping. 'mni' is a pain, I have to reflect on that.
(honestly, I dislike mmu_notifier_mm quite a lot too)

I think it would be overall nicer with better names for the original
structs. Perhaps:

 mmn_* - MMU notifier prefix
 mmn_state <- struct mmu_notifier_mm
 mmn_subscription (mmn_sub) <- struct mmu_notifier
 mmn_range_subscription (mmn_range_sub) <- struct mmu_interval_notifier
 mmn_invalidate_desc <- struct mmu_notifier_range

At least this is how I describe them in my mind..  This is a lot of
churn, and spreads through many drivers. This is why I kept the names
as-is and we ended up with the also quite bad 'mmu_interval_notifier'

Maybe just switch mmu_notifier_mm for mmn_state and leave the drivers
alone?

Anyone on the CC list have advice?

Jason
Jerome Glisse Dec. 5, 2019, 4:03 p.m. UTC | #5
On Tue, Dec 03, 2019 at 02:42:12AM +0000, Jason Gunthorpe wrote:
> On Sat, Nov 30, 2019 at 10:23:31AM -0800, Linus Torvalds wrote:
> > On Sat, Nov 30, 2019 at 10:03 AM Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> > >
> > > I'll try to figure the code out, but my initial reaction was "yeah,
> > > not in my VM".
> > 
> > Why is it ok to sometimes do
> > 
> >     WRITE_ONCE(mni->invalidate_seq, cur_seq);
> > 
> > (to pair with the unlocked READ_ONCE), and sometimes then do
> > 
> >     mni->invalidate_seq = mmn_mm->invalidate_seq;
> > 
> > My initial guess was that the latter is only done at initialization time,
> > but at least in one case it's done *after* the mni has been added to
> > the mmn_mm (oh, how I despise those names - I can only repeat: WTF?).
> 
> Yes, the only occurrences are in the notifier_insert, under the
> spinlock. The one case where it is out of the natural order was to
> make the manipulation of seq a bit saner; but in all cases, since the
> spinlock is held, there is no way for another thread to get the pointer
> to the 'mmu_interval_notifier *' to do the unlocked read.
> 
> Regarding the ugly names.. Naming has been really hard here because
> currently everything is a 'mmu notifier' and the natural abbreviations
> from there are crummy. Here is the basic summary:
> 
> struct mmu_notifier_mm (ie the mm->mmu_notifier_mm)
>    -> mmn_mm
> struct mm_struct 
>    -> mm
> struct mmu_notifier (ie the user subscription to the mm_struct)
>    -> mn
> struct mmu_interval_notifier (the other kind of user subscription)
>    -> mni

What about "interval" the context should already tell people
it is related to mmu notifier and thus a notifier. I would
just remove the notifier suffix, this would match the below
range.

> struct mmu_notifier_range (ie the args to invalidate_range)
>    -> range

Yeah, keep "range"; the context should tell you it is related to mmu
notifier.

> 
> I can send a patch to switch mmn_mm to mmu_notifier_mm, which is the
> only pre-existing name for this value. But IIRC, it is somewhat ugly
> with long line wrapping. 'mni' is a pain, I have to reflect on that.
> (honestly, I dislike mmu_notifier_mm quite a lot too)
> 
> I think it would be overall nicer with better names for the original
> structs. Perhaps:
> 
>  mmn_* - MMU notifier prefix
>  mmn_state <- struct mmu_notifier_mm
>  mmn_subscription (mmn_sub) <- struct mmu_notifier
>  mmn_range_subscription (mmn_range_sub) <- struct mmu_interval_notifier
>  mmn_invalidate_desc <- struct mmu_notifier_range

This looks good.

> 
> At least this is how I describe them in my mind..  This is a lot of
> churn, and spreads through many drivers. This is why I kept the names
> as-is and we ended up with the also quite bad 'mmu_interval_notifier'
> 
> Maybe just switch mmu_notifier_mm for mmn_state and leave the drivers
> alone?
> 
> Anyone on the CC list have advice?

Maybe we can do a semantic patch to do the conversion, and then Linus
can easily apply the patch by just re-running coccinelle.
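
A sketch of what such a semantic patch could look like (untested; the
new name below is just a placeholder for whatever gets picked):

    @@
    @@
    - struct mmu_notifier_mm
    + struct mmn_state

run with something like "spatch --sp-file rename.cocci --in-place --dir mm/",
though comments and strings would still need fixing by hand.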

Cheers,
Jérôme
John Hubbard Dec. 5, 2019, 11:03 p.m. UTC | #6
On 12/2/19 6:42 PM, Jason Gunthorpe wrote:
...
> Regarding the ugly names.. Naming has been really hard here because
> currently everything is a 'mmu notifier' and the natural abbreviations
> from there are crummy. Here is the basic summary:
> 
> struct mmu_notifier_mm (ie the mm->mmu_notifier_mm)
>    -> mmn_mm
> struct mm_struct 
>    -> mm
> struct mmu_notifier (ie the user subscription to the mm_struct)
>    -> mn
> struct mmu_interval_notifier (the other kind of user subscription)
>    -> mni
> struct mmu_notifier_range (ie the args to invalidate_range)
>    -> range
> 
> I can send a patch to switch mmn_mm to mmu_notifier_mm, which is the
> only pre-existing name for this value. But IIRC, it is somewhat ugly
> with long line wrapping. 'mni' is a pain, I have to reflect on that.
> (honestly, I dislike mmu_notifier_mm quite a lot too)
> 
> I think it would be overall nicer with better names for the original
> structs. Perhaps:
> 
>  mmn_* - MMU notifier prefix
>  mmn_state <- struct mmu_notifier_mm
>  mmn_subscription (mmn_sub) <- struct mmu_notifier
>  mmn_range_subscription (mmn_range_sub) <- struct mmu_interval_notifier
>  mmn_invalidate_desc <- struct mmu_notifier_range
> 
> At least this is how I describe them in my mind..  This is a lot of
> churn, and spreads through many drivers. This is why I kept the names
> as-is and we ended up with the also quite bad 'mmu_interval_notifier'
> 
> Maybe just switch mmu_notifier_mm for mmn_state and leave the drivers
> alone?
> 
> Anyone on the CC list have advice?
> 
> Jason

No advice, just a naming idea similar in spirit to Jerome's suggestion
(use a longer descriptive word, and don't try to capture the entire phrase):
use "notif" in place of the unloved "mmn". So partially, approximately like 
this:

notif_*                                    <- MMU notifier prefix
notif_state                                <- struct mmu_notifier_mm
notif_subscription (notif_sub)             <- struct mmu_notifier
notif_invalidate_desc                      <- struct mmu_notifier_range*
notif_range_subscription (notif_range_sub) <- struct mmu_interval_notifier



thanks,
John Hubbard
Jason Gunthorpe Dec. 11, 2019, 10:47 p.m. UTC | #7
On Thu, Dec 05, 2019 at 03:03:56PM -0800, John Hubbard wrote:

> No advice, just a naming idea similar in spirit to Jerome's suggestion
> (use a longer descriptive word, and don't try to capture the entire phrase):
> use "notif" in place of the unloved "mmn". So partially, approximately like 
> this:
> 
> notif_*                                    <- MMU notifier prefix
> notif_state                                <- struct mmu_notifier_mm
> notif_subscription (notif_sub)             <- struct mmu_notifier
> notif_invalidate_desc                      <- struct mmu_notifier_range*
> notif_range_subscription (notif_range_sub) <- struct mmu_interval_notifier

To me 'notif' suggests this belongs to the stuff in notifier.h - ie
the naked word notifier is already taken.

Jason
Jason Gunthorpe Dec. 11, 2019, 10:57 p.m. UTC | #8
On Thu, Dec 05, 2019 at 11:03:24AM -0500, Jerome Glisse wrote:

> > struct mmu_notifier_mm (ie the mm->mmu_notifier_mm)
> >    -> mmn_mm
> > struct mm_struct 
> >    -> mm
> > struct mmu_notifier (ie the user subscription to the mm_struct)
> >    -> mn
> > struct mmu_interval_notifier (the other kind of user subscription)
> >    -> mni
> 
> What about "interval" the context should already tell people
> it is related to mmu notifier and thus a notifier. I would
> just remove the notifier suffix, this would match the below
> range.

Interval could be a good replacement for mni in the mm/mmu_notifier
file if we don't do the wholesale rename.

> > I think it would be overall nicer with better names for the original
> > structs. Perhaps:
> > 
> >  mmn_* - MMU notifier prefix
> >  mmn_state <- struct mmu_notifier_mm
> >  mmn_subscription (mmn_sub) <- struct mmu_notifier
> >  mmn_range_subscription (mmn_range_sub) <- struct mmu_interval_notifier
> >  mmn_invalidate_desc <- struct mmu_notifier_range
> 
> This looks good.

Well, let's just bite the bullet then and switch it. Do you like
'state'? I thought that was the weakest one.

We could use mmnotif as the prefix; this makes the longest:

  struct mmnotif_range_subscription

which is reasonable enough.

> Maybe we can do a semantic patch to do the conversion, and then Linus
> can easily apply the patch by just re-running coccinelle.

I tried this the last time I renamed everything; it was OK, but it missed
updating the comments. So it still needs some by-hand helping.

I'll make some patches next week when I get back.

Jason
Daniel Vetter Dec. 13, 2019, 10:19 a.m. UTC | #9
On Wed, Dec 11, 2019 at 10:57:13PM +0000, Jason Gunthorpe wrote:
> On Thu, Dec 05, 2019 at 11:03:24AM -0500, Jerome Glisse wrote:
> 
> > > struct mmu_notifier_mm (ie the mm->mmu_notifier_mm)
> > >    -> mmn_mm
> > > struct mm_struct 
> > >    -> mm
> > > struct mmu_notifier (ie the user subscription to the mm_struct)
> > >    -> mn
> > > struct mmu_interval_notifier (the other kind of user subscription)
> > >    -> mni
> > 
> > What about "interval" the context should already tell people
> > it is related to mmu notifier and thus a notifier. I would
> > just remove the notifier suffix, this would match the below
> > range.
> 
> Interval could be a good replacement for mni in the mm/mmu_notifier
> file if we don't do the wholesale rename.
> 
> > > I think it would be overall nicer with better names for the original
> > > structs. Perhaps:
> > > 
> > >  mmn_* - MMU notifier prefix
> > >  mmn_state <- struct mmu_notifier_mm
> > >  mmn_subscription (mmn_sub) <- struct mmu_notifier
> > >  mmn_range_subscription (mmn_range_sub) <- struct mmu_interval_notifier
> > >  mmn_invalidate_desc <- struct mmu_notifier_range
> > 
> > This looks good.
> 
> Well, let's just bite the bullet then and switch it. Do you like
> 'state'? I thought that was the weakest one.

Since you're asking, here's my bikeshed. I kinda agree _state looks a bit
strange for this; what about a _link suffix, in the spirit of

	struct list_head link;

The other common name is "node", but I think that's confusing in the
context of mm code. The purpose of this struct is to link everything
together (and yes it carries also some state, but the main job is to link
a mm_struct to a mmu_notifier). At least for me a _state is configuration
state for a specific object, not something that links a bunch of things
together. But I'm biased on this, since we use that pattern in drm for all
the display state tracking.

Also feel free to ignore my bikeshed :-)

Aside from this I think the proposed names are a solid improvement.
-Daniel

> 
> We could use mmnotif as the prefix; this makes the longest:
> 
>   struct mmnotif_range_subscription
> 
> which is reasonable enough.
> 
> > Maybe we can do a semantic patch to do the conversion, and then Linus
> > can easily apply the patch by just re-running coccinelle.
> 
> I tried this the last time I renamed everything; it was OK, but it missed
> updating the comments. So it still needs some by-hand helping.
> 
> I'll make some patches next week when I get back.
> 
> Jason
Jason Gunthorpe Dec. 18, 2019, 2:59 p.m. UTC | #10
On Fri, Dec 13, 2019 at 11:19:16AM +0100, Daniel Vetter wrote:
> On Wed, Dec 11, 2019 at 10:57:13PM +0000, Jason Gunthorpe wrote:
> > On Thu, Dec 05, 2019 at 11:03:24AM -0500, Jerome Glisse wrote:
> > 
> > > > struct mmu_notifier_mm (ie the mm->mmu_notifier_mm)
> > > >    -> mmn_mm
> > > > struct mm_struct 
> > > >    -> mm
> > > > struct mmu_notifier (ie the user subscription to the mm_struct)
> > > >    -> mn
> > > > struct mmu_interval_notifier (the other kind of user subscription)
> > > >    -> mni
> > > 
> > > What about "interval" the context should already tell people
> > > it is related to mmu notifier and thus a notifier. I would
> > > just remove the notifier suffix, this would match the below
> > > range.
> > 
> > Interval could be a good replacement for mni in the mm/mmu_notifier
> > file if we don't do the wholesale rename.
> > 
> > > > I think it would be overall nicer with better names for the original
> > > > structs. Perhaps:
> > > > 
> > > >  mmn_* - MMU notifier prefix
> > > >  mmn_state <- struct mmu_notifier_mm
> > > >  mmn_subscription (mmn_sub) <- struct mmu_notifier
> > > >  mmn_range_subscription (mmn_range_sub) <- struct mmu_interval_notifier
> > > >  mmn_invalidate_desc <- struct mmu_notifier_range
> > > 
> > > This looks good.
> > 
> > Well, let's just bite the bullet then and switch it. Do you like
> > 'state'? I thought that was the weakest one.
> 
> Since you're asking, here's my bikeshed. I kinda agree _state looks a bit
> strange for this; what about a _link suffix, in the spirit of

Do you think calling it 'mmn_subscriptions' is clear?

I.e., a struct mmn_subscriptions holds the lists of struct
mmn_subscription and struct mmn_range_subscription?
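
For concreteness, roughly (a hypothetical sketch; the field set mirrors
what today's struct mmu_notifier_mm holds, minus the seq/lock/wq
bookkeeping):

    /* hypothetical naming, not actual kernel code */
    struct mmn_subscriptions {
            struct hlist_head list;        /* struct mmn_subscription entries */
            struct rb_root_cached itree;   /* struct mmn_range_subscription entries */
    };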

Jason
Linus Torvalds Dec. 18, 2019, 4:53 p.m. UTC | #11
On Wed, Dec 18, 2019 at 6:59 AM Jason Gunthorpe <jgg@mellanox.com> wrote:
>
> Do you think calling it 'mmn_subscriptions' is clear?

Why do you want that "mmn"?

Guys, the "mmn" part is clear from the _context_. The function name is

When the function name is something like "mmu_interval_read_begin()",
and the filename is "mm/mmu_notifier.c", you do NOT NEED silly
prefixes like "mmn" for local variables.

They add NOTHING.

And they make the code an illegible mess.

Yes, global function names need to be unique, and if they aren't
really core, they want some prefix that explains the context, because
global functions are called from _outside_ the context that explains
them.

But if it's a "struct mmu_interval_notifier" pointer, and it's inside
a file that is all about these pointers, it shouldn't be called
"mmn_xyz".  That's not a name. That's line noise.

So call it a "notifier". Maybe even an "interval_notifier" if you
don't mind the typing. Name it by something _descriptive_. And if you
want.

And "subscriptions" is a lovely name. What does the "mmn" buy you?

Just to clarify: the names I really hated were the local variable
names (and the argument names) that were all entirely within the
context of mm/mmu_notifier.c. Calling something "mmn_mm" is a random
jumble of letters that looks more like you're humming than you're
speaking.

Don't mumble. Speak _clearly_.

The other side of "short names" is that some non-local conventions
exist because they are _so_ global. So if it's just a mm pointer, call
it "mm". We do have some very core concepts in the kernel that
permeate _everything_, and those core things we tend to have very
short names for. So whenever you're working with VM code, you'll see
lots of small names like "mm", "vma", "pte" etc. They aren't exactly
clear, but they are _globally_ something you read and learn when you
work on the Linux VM code.

That's very different from "mmn" - the "mmn" thing isn't some global
shorthand, it is just a local abomination.

So "notifier_mm" makes sense - it's the mm for a notifier. But
"mmn_notifier" does not, because "mmn" only makes sense in a local
context, and in that local context it's not any new information at
all.

See the difference? Two shorthands, but one makes sense and adds
information, while the other is just unnecessary and pointless and
doesn't add anything at all.

                Linus
Jason Gunthorpe Dec. 18, 2019, 6:37 p.m. UTC | #12
On Wed, Dec 18, 2019 at 08:53:05AM -0800, Linus Torvalds wrote:
> On Wed, Dec 18, 2019 at 6:59 AM Jason Gunthorpe <jgg@mellanox.com> wrote:

> Yes, global function names need to be unique, and if they aren't
> really core, they want some prefix that explains the context, because
> global functions are called from _outside_ the context that explains
> them.

Yes, in this thread I have mostly talked about changing the global
struct names, and for that mmn_ is the context-explaining prefix.

> But if it's a "struct mmu_interval_notifier" pointer, and it's inside
> a file that is all about these pointers, it shouldn't be called
> "mmn_xyz".  That's not a name. That's line noise.
> 
> So call it a "notifier". Maybe even an "interval_notifier" if you
> don't mind the typing. Name it by something _descriptive_. And if you
> want.
> 
> And "subscriptions" is a lovely name. What does the "mmn" buy you?

To be clear, I was proposing this as the struct name:

  'struct mmu_notifier_mm' becomes 'struct mmn_subscriptions'

(and similar for other mmu_notifier_x* and mmu_x_notifier patterns)

From there we now have a natural and readable local variable name like
'subscriptions' within mm/mmu_notifier.c

I've just started looking at this in detail, but it seems sticking
with 'mmu_notifier_' as the global prefix will avoid a fair amount of
churn. So let's not try to shorten it to mmn_ as the global prefix.

> Just to clarify: the names I really hated were the local variable
> names (and the argument names) that were all entirely within the
> context of mm/mmu_notifier.c. Calling something "mmn_mm" is a random
> jumble of letters that looks more like you're humming than you're
> speaking.

Yes, I understood - I've approached trying to have good names for the
variables via having good names for their struct's.

Below is what I'm suggesting for the first patch. I am intending a
patch series to make the naming better across mmu_notifier.h; this
would be the first.

Next patches would probably replace 'mn' with 'list_sub' (struct
mmu_notifier_subscription) and 'mni' with 'range_sub' (struct
mmu_notifier_range_subscription).

Thus we have code working on a 'list_sub/range_sub' and storing it in
a 'subscriptions'. 

I think this is what you are looking for?

From c656d0862dedcc2f5f4beda129a2ac51c892be7e Mon Sep 17 00:00:00 2001
From: Jason Gunthorpe <jgg@mellanox.com>
Date: Wed, 18 Dec 2019 13:40:35 -0400
Subject: [PATCH] mm/mmu_notifier: Rename struct mmu_notifier_mm to
 mmu_notifier_subscriptions

The name mmu_notifier_mm implies that the thing is a mm_struct pointer,
and is difficult to abbreviate. The struct is actually holding the
interval tree and hlist containing the notifiers subscribed to a mm.

Use 'subscriptions' as the variable name for this struct instead of the
really terrible and misleading 'mmn_mm'.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 include/linux/mm_types.h     |   2 +-
 include/linux/mmu_notifier.h |  18 +-
 kernel/fork.c                |   4 +-
 mm/debug.c                   |   4 +-
 mm/mmu_notifier.c            | 322 ++++++++++++++++++-----------------
 5 files changed, 182 insertions(+), 168 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 270aa8fd2800b4..e87bb864bdb29a 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -490,7 +490,7 @@ struct mm_struct {
 		/* store ref to file /proc/<pid>/exe symlink points to */
 		struct file __rcu *exe_file;
 #ifdef CONFIG_MMU_NOTIFIER
-		struct mmu_notifier_mm *mmu_notifier_mm;
+		struct mmu_notifier_subscriptions *notifier_subscriptions;
 #endif
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
 		pgtable_t pmd_huge_pte; /* protected by page_table_lock */
diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 9e6caa8ecd1938..a302925fbc6177 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -8,7 +8,7 @@
 #include <linux/srcu.h>
 #include <linux/interval_tree.h>
 
-struct mmu_notifier_mm;
+struct mmu_notifier_subscriptions;
 struct mmu_notifier;
 struct mmu_notifier_range;
 struct mmu_interval_notifier;
@@ -265,7 +265,7 @@ struct mmu_notifier_range {
 
 static inline int mm_has_notifiers(struct mm_struct *mm)
 {
-	return unlikely(mm->mmu_notifier_mm);
+	return unlikely(mm->notifier_subscriptions);
 }
 
 struct mmu_notifier *mmu_notifier_get_locked(const struct mmu_notifier_ops *ops,
@@ -364,7 +364,7 @@ static inline bool mmu_interval_check_retry(struct mmu_interval_notifier *mni,
 	return READ_ONCE(mni->invalidate_seq) != seq;
 }
 
-extern void __mmu_notifier_mm_destroy(struct mm_struct *mm);
+extern void __mmu_notifier_subscriptions_destroy(struct mm_struct *mm);
 extern void __mmu_notifier_release(struct mm_struct *mm);
 extern int __mmu_notifier_clear_flush_young(struct mm_struct *mm,
 					  unsigned long start,
@@ -480,15 +480,15 @@ static inline void mmu_notifier_invalidate_range(struct mm_struct *mm,
 		__mmu_notifier_invalidate_range(mm, start, end);
 }
 
-static inline void mmu_notifier_mm_init(struct mm_struct *mm)
+static inline void mmu_notifier_subscriptions_init(struct mm_struct *mm)
 {
-	mm->mmu_notifier_mm = NULL;
+	mm->notifier_subscriptions = NULL;
 }
 
-static inline void mmu_notifier_mm_destroy(struct mm_struct *mm)
+static inline void mmu_notifier_subscriptions_destroy(struct mm_struct *mm)
 {
 	if (mm_has_notifiers(mm))
-		__mmu_notifier_mm_destroy(mm);
+		__mmu_notifier_subscriptions_destroy(mm);
 }
 
 
@@ -692,11 +692,11 @@ static inline void mmu_notifier_invalidate_range(struct mm_struct *mm,
 {
 }
 
-static inline void mmu_notifier_mm_init(struct mm_struct *mm)
+static inline void mmu_notifier_subscriptions_init(struct mm_struct *mm)
 {
 }
 
-static inline void mmu_notifier_mm_destroy(struct mm_struct *mm)
+static inline void mmu_notifier_subscriptions_destroy(struct mm_struct *mm)
 {
 }
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 2508a4f238a3f3..047865086cdf74 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -692,7 +692,7 @@ void __mmdrop(struct mm_struct *mm)
 	WARN_ON_ONCE(mm == current->active_mm);
 	mm_free_pgd(mm);
 	destroy_context(mm);
-	mmu_notifier_mm_destroy(mm);
+	mmu_notifier_subscriptions_destroy(mm);
 	check_mm(mm);
 	put_user_ns(mm->user_ns);
 	free_mm(mm);
@@ -1025,7 +1025,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	mm_init_aio(mm);
 	mm_init_owner(mm, p);
 	RCU_INIT_POINTER(mm->exe_file, NULL);
-	mmu_notifier_mm_init(mm);
+	mmu_notifier_subscriptions_init(mm);
 	init_tlb_flush_pending(mm);
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
 	mm->pmd_huge_pte = NULL;
diff --git a/mm/debug.c b/mm/debug.c
index 0461df1207cb09..74ee73cf7079a5 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -153,7 +153,7 @@ void dump_mm(const struct mm_struct *mm)
 #endif
 		"exe_file %px\n"
 #ifdef CONFIG_MMU_NOTIFIER
-		"mmu_notifier_mm %px\n"
+		"notifier_subscriptions %px\n"
 #endif
 #ifdef CONFIG_NUMA_BALANCING
 		"numa_next_scan %lu numa_scan_offset %lu numa_scan_seq %d\n"
@@ -185,7 +185,7 @@ void dump_mm(const struct mm_struct *mm)
 #endif
 		mm->exe_file,
 #ifdef CONFIG_MMU_NOTIFIER
-		mm->mmu_notifier_mm,
+		mm->notifier_subscriptions,
 #endif
 #ifdef CONFIG_NUMA_BALANCING
 		mm->numa_next_scan, mm->numa_scan_offset, mm->numa_scan_seq,
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index f76ea05b1cb011..02de878964a787 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -29,12 +29,12 @@ struct lockdep_map __mmu_notifier_invalidate_range_start_map = {
 #endif
 
 /*
- * The mmu notifier_mm structure is allocated and installed in
- * mm->mmu_notifier_mm inside the mm_take_all_locks() protected
+ * The mmu_notifier_subscriptions structure is allocated and installed in
+ * mm->notifier_subscriptions inside the mm_take_all_locks() protected
  * critical section and it's released only when mm_count reaches zero
  * in mmdrop().
  */
-struct mmu_notifier_mm {
+struct mmu_notifier_subscriptions {
 	/* all mmu notifiers registered in this mm are queued in this list */
 	struct hlist_head list;
 	bool has_itree;
@@ -75,7 +75,7 @@ struct mmu_notifier_mm {
  *  - some range on the mm_struct is being invalidated
  *  - the itree is allowed to change
  *
- * Operations on mmu_notifier_mm->invalidate_seq (under spinlock):
+ * Operations on notifier_subscriptions->invalidate_seq (under spinlock):
  *    seq |= 1  # Begin writing
  *    seq++     # Release the writing state
  *    seq & 1   # True if a writer exists
@@ -83,32 +83,33 @@ struct mmu_notifier_mm {
  * The later state avoids some expensive work on inv_end in the common case of
  * no mni monitoring the VA.
  */
-static bool mn_itree_is_invalidating(struct mmu_notifier_mm *mmn_mm)
+static bool
+mn_itree_is_invalidating(struct mmu_notifier_subscriptions *subscriptions)
 {
-	lockdep_assert_held(&mmn_mm->lock);
-	return mmn_mm->invalidate_seq & 1;
+	lockdep_assert_held(&subscriptions->lock);
+	return subscriptions->invalidate_seq & 1;
 }
 
 static struct mmu_interval_notifier *
-mn_itree_inv_start_range(struct mmu_notifier_mm *mmn_mm,
+mn_itree_inv_start_range(struct mmu_notifier_subscriptions *subscriptions,
 			 const struct mmu_notifier_range *range,
 			 unsigned long *seq)
 {
 	struct interval_tree_node *node;
 	struct mmu_interval_notifier *res = NULL;
 
-	spin_lock(&mmn_mm->lock);
-	mmn_mm->active_invalidate_ranges++;
-	node = interval_tree_iter_first(&mmn_mm->itree, range->start,
+	spin_lock(&subscriptions->lock);
+	subscriptions->active_invalidate_ranges++;
+	node = interval_tree_iter_first(&subscriptions->itree, range->start,
 					range->end - 1);
 	if (node) {
-		mmn_mm->invalidate_seq |= 1;
+		subscriptions->invalidate_seq |= 1;
 		res = container_of(node, struct mmu_interval_notifier,
 				   interval_tree);
 	}
 
-	*seq = mmn_mm->invalidate_seq;
-	spin_unlock(&mmn_mm->lock);
+	*seq = subscriptions->invalidate_seq;
+	spin_unlock(&subscriptions->lock);
 	return res;
 }
 
@@ -125,20 +126,20 @@ mn_itree_inv_next(struct mmu_interval_notifier *mni,
 	return container_of(node, struct mmu_interval_notifier, interval_tree);
 }
 
-static void mn_itree_inv_end(struct mmu_notifier_mm *mmn_mm)
+static void mn_itree_inv_end(struct mmu_notifier_subscriptions *subscriptions)
 {
 	struct mmu_interval_notifier *mni;
 	struct hlist_node *next;
 
-	spin_lock(&mmn_mm->lock);
-	if (--mmn_mm->active_invalidate_ranges ||
-	    !mn_itree_is_invalidating(mmn_mm)) {
-		spin_unlock(&mmn_mm->lock);
+	spin_lock(&subscriptions->lock);
+	if (--subscriptions->active_invalidate_ranges ||
+	    !mn_itree_is_invalidating(subscriptions)) {
+		spin_unlock(&subscriptions->lock);
 		return;
 	}
 
 	/* Make invalidate_seq even */
-	mmn_mm->invalidate_seq++;
+	subscriptions->invalidate_seq++;
 
 	/*
 	 * The inv_end incorporates a deferred mechanism like rtnl_unlock().
@@ -146,19 +147,19 @@ static void mn_itree_inv_end(struct mmu_notifier_mm *mmn_mm)
 	 * they are progressed. This arrangement for tree updates is used to
 	 * avoid using a blocking lock during invalidate_range_start.
 	 */
-	hlist_for_each_entry_safe(mni, next, &mmn_mm->deferred_list,
+	hlist_for_each_entry_safe(mni, next, &subscriptions->deferred_list,
 				  deferred_item) {
 		if (RB_EMPTY_NODE(&mni->interval_tree.rb))
 			interval_tree_insert(&mni->interval_tree,
-					     &mmn_mm->itree);
+					     &subscriptions->itree);
 		else
 			interval_tree_remove(&mni->interval_tree,
-					     &mmn_mm->itree);
+					     &subscriptions->itree);
 		hlist_del(&mni->deferred_item);
 	}
-	spin_unlock(&mmn_mm->lock);
+	spin_unlock(&subscriptions->lock);
 
-	wake_up_all(&mmn_mm->wq);
+	wake_up_all(&subscriptions->wq);
 }
 
 /**
@@ -182,7 +183,8 @@ static void mn_itree_inv_end(struct mmu_notifier_mm *mmn_mm)
  */
 unsigned long mmu_interval_read_begin(struct mmu_interval_notifier *mni)
 {
-	struct mmu_notifier_mm *mmn_mm = mni->mm->mmu_notifier_mm;
+	struct mmu_notifier_subscriptions *subscriptions =
+		mni->mm->notifier_subscriptions;
 	unsigned long seq;
 	bool is_invalidating;
 
@@ -190,17 +192,18 @@ unsigned long mmu_interval_read_begin(struct mmu_interval_notifier *mni)
 	 * If the mni has a different seq value under the user_lock than we
 	 * started with then it has collided.
 	 *
-	 * If the mni currently has the same seq value as the mmn_mm seq, then
-	 * it is currently between invalidate_start/end and is colliding.
+	 * If the mni currently has the same seq value as the subscriptions
+	 * seq, then it is currently between invalidate_start/end and is
+	 * colliding.
 	 *
 	 * The locking looks broadly like this:
 	 *   mn_tree_invalidate_start():          mmu_interval_read_begin():
 	 *                                         spin_lock
 	 *                                          seq = READ_ONCE(mni->invalidate_seq);
-	 *                                          seq == mmn_mm->invalidate_seq
+	 *                                          seq == subs->invalidate_seq
 	 *                                         spin_unlock
 	 *    spin_lock
-	 *     seq = ++mmn_mm->invalidate_seq
+	 *     seq = ++subscriptions->invalidate_seq
 	 *    spin_unlock
 	 *     op->invalidate_range():
 	 *       user_lock
@@ -212,7 +215,7 @@ unsigned long mmu_interval_read_begin(struct mmu_interval_notifier *mni)
 	 *
 	 *   mn_itree_inv_end():
 	 *    spin_lock
-	 *     seq = ++mmn_mm->invalidate_seq
+	 *     seq = ++subscriptions->invalidate_seq
 	 *    spin_unlock
 	 *
 	 *                                        user_lock
@@ -224,24 +227,24 @@ unsigned long mmu_interval_read_begin(struct mmu_interval_notifier *mni)
 	 * eventual mmu_interval_read_retry(), which provides a barrier via the
 	 * user_lock.
 	 */
-	spin_lock(&mmn_mm->lock);
+	spin_lock(&subscriptions->lock);
 	/* Pairs with the WRITE_ONCE in mmu_interval_set_seq() */
 	seq = READ_ONCE(mni->invalidate_seq);
-	is_invalidating = seq == mmn_mm->invalidate_seq;
-	spin_unlock(&mmn_mm->lock);
+	is_invalidating = seq == subscriptions->invalidate_seq;
+	spin_unlock(&subscriptions->lock);
 
 	/*
 	 * mni->invalidate_seq must always be set to an odd value via
 	 * mmu_interval_set_seq() using the provided cur_seq from
 	 * mn_itree_inv_start_range(). This ensures that if seq does wrap we
 	 * will always clear the below sleep in some reasonable time as
-	 * mmn_mm->invalidate_seq is even in the idle state.
+	 * subscriptions->invalidate_seq is even in the idle state.
 	 */
 	lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
 	lock_map_release(&__mmu_notifier_invalidate_range_start_map);
 	if (is_invalidating)
-		wait_event(mmn_mm->wq,
-			   READ_ONCE(mmn_mm->invalidate_seq) != seq);
+		wait_event(subscriptions->wq,
+			   READ_ONCE(subscriptions->invalidate_seq) != seq);
 
 	/*
 	 * Notice that mmu_interval_read_retry() can already be true at this
@@ -253,7 +256,7 @@ unsigned long mmu_interval_read_begin(struct mmu_interval_notifier *mni)
 }
 EXPORT_SYMBOL_GPL(mmu_interval_read_begin);
 
-static void mn_itree_release(struct mmu_notifier_mm *mmn_mm,
+static void mn_itree_release(struct mmu_notifier_subscriptions *subscriptions,
 			     struct mm_struct *mm)
 {
 	struct mmu_notifier_range range = {
@@ -267,13 +270,13 @@ static void mn_itree_release(struct mmu_notifier_mm *mmn_mm,
 	unsigned long cur_seq;
 	bool ret;
 
-	for (mni = mn_itree_inv_start_range(mmn_mm, &range, &cur_seq); mni;
-	     mni = mn_itree_inv_next(mni, &range)) {
+	for (mni = mn_itree_inv_start_range(subscriptions, &range, &cur_seq);
+	     mni; mni = mn_itree_inv_next(mni, &range)) {
 		ret = mni->ops->invalidate(mni, &range, cur_seq);
 		WARN_ON(!ret);
 	}
 
-	mn_itree_inv_end(mmn_mm);
+	mn_itree_inv_end(subscriptions);
 }
 
 /*
@@ -283,12 +286,12 @@ static void mn_itree_release(struct mmu_notifier_mm *mmn_mm,
  * in parallel despite there being no task using this mm any more,
  * through the vmas outside of the exit_mmap context, such as with
  * vmtruncate. This serializes against mmu_notifier_unregister with
- * the mmu_notifier_mm->lock in addition to SRCU and it serializes
- * against the other mmu notifiers with SRCU. struct mmu_notifier_mm
+ * the notifier_subscriptions->lock in addition to SRCU and it serializes
+ * against the other mmu notifiers with SRCU. struct mmu_notifier_subscriptions
  * can't go away from under us as exit_mmap holds an mm_count pin
  * itself.
  */
-static void mn_hlist_release(struct mmu_notifier_mm *mmn_mm,
+static void mn_hlist_release(struct mmu_notifier_subscriptions *subscriptions,
 			     struct mm_struct *mm)
 {
 	struct mmu_notifier *mn;
@@ -299,7 +302,7 @@ static void mn_hlist_release(struct mmu_notifier_mm *mmn_mm,
 	 * ->release returns.
 	 */
 	id = srcu_read_lock(&srcu);
-	hlist_for_each_entry_rcu(mn, &mmn_mm->list, hlist)
+	hlist_for_each_entry_rcu(mn, &subscriptions->list, hlist)
 		/*
 		 * If ->release runs before mmu_notifier_unregister it must be
 		 * handled, as it's the only way for the driver to flush all
@@ -309,9 +312,9 @@ static void mn_hlist_release(struct mmu_notifier_mm *mmn_mm,
 		if (mn->ops->release)
 			mn->ops->release(mn, mm);
 
-	spin_lock(&mmn_mm->lock);
-	while (unlikely(!hlist_empty(&mmn_mm->list))) {
-		mn = hlist_entry(mmn_mm->list.first, struct mmu_notifier,
+	spin_lock(&subscriptions->lock);
+	while (unlikely(!hlist_empty(&subscriptions->list))) {
+		mn = hlist_entry(subscriptions->list.first, struct mmu_notifier,
 				 hlist);
 		/*
 		 * We arrived before mmu_notifier_unregister so
@@ -321,7 +324,7 @@ static void mn_hlist_release(struct mmu_notifier_mm *mmn_mm,
 		 */
 		hlist_del_init_rcu(&mn->hlist);
 	}
-	spin_unlock(&mmn_mm->lock);
+	spin_unlock(&subscriptions->lock);
 	srcu_read_unlock(&srcu, id);
 
 	/*
@@ -330,21 +333,22 @@ static void mn_hlist_release(struct mmu_notifier_mm *mmn_mm,
 	 * until the ->release method returns, if it was invoked by
 	 * mmu_notifier_unregister.
 	 *
-	 * The mmu_notifier_mm can't go away from under us because one mm_count
-	 * is held by exit_mmap.
+	 * The notifier_subscriptions can't go away from under us because
+	 * one mm_count is held by exit_mmap.
 	 */
 	synchronize_srcu(&srcu);
 }
 
 void __mmu_notifier_release(struct mm_struct *mm)
 {
-	struct mmu_notifier_mm *mmn_mm = mm->mmu_notifier_mm;
+	struct mmu_notifier_subscriptions *subscriptions =
+		mm->notifier_subscriptions;
 
-	if (mmn_mm->has_itree)
-		mn_itree_release(mmn_mm, mm);
+	if (subscriptions->has_itree)
+		mn_itree_release(subscriptions, mm);
 
-	if (!hlist_empty(&mmn_mm->list))
-		mn_hlist_release(mmn_mm, mm);
+	if (!hlist_empty(&subscriptions->list))
+		mn_hlist_release(subscriptions, mm);
 }
 
 /*
@@ -360,7 +364,7 @@ int __mmu_notifier_clear_flush_young(struct mm_struct *mm,
 	int young = 0, id;
 
 	id = srcu_read_lock(&srcu);
-	hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) {
+	hlist_for_each_entry_rcu(mn, &mm->notifier_subscriptions->list, hlist) {
 		if (mn->ops->clear_flush_young)
 			young |= mn->ops->clear_flush_young(mn, mm, start, end);
 	}
@@ -377,7 +381,7 @@ int __mmu_notifier_clear_young(struct mm_struct *mm,
 	int young = 0, id;
 
 	id = srcu_read_lock(&srcu);
-	hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) {
+	hlist_for_each_entry_rcu(mn, &mm->notifier_subscriptions->list, hlist) {
 		if (mn->ops->clear_young)
 			young |= mn->ops->clear_young(mn, mm, start, end);
 	}
@@ -393,7 +397,7 @@ int __mmu_notifier_test_young(struct mm_struct *mm,
 	int young = 0, id;
 
 	id = srcu_read_lock(&srcu);
-	hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) {
+	hlist_for_each_entry_rcu(mn, &mm->notifier_subscriptions->list, hlist) {
 		if (mn->ops->test_young) {
 			young = mn->ops->test_young(mn, mm, address);
 			if (young)
@@ -412,21 +416,22 @@ void __mmu_notifier_change_pte(struct mm_struct *mm, unsigned long address,
 	int id;
 
 	id = srcu_read_lock(&srcu);
-	hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) {
+	hlist_for_each_entry_rcu(mn, &mm->notifier_subscriptions->list,
+				 hlist) {
 		if (mn->ops->change_pte)
 			mn->ops->change_pte(mn, mm, address, pte);
 	}
 	srcu_read_unlock(&srcu, id);
 }
 
-static int mn_itree_invalidate(struct mmu_notifier_mm *mmn_mm,
+static int mn_itree_invalidate(struct mmu_notifier_subscriptions *subscriptions,
 			       const struct mmu_notifier_range *range)
 {
 	struct mmu_interval_notifier *mni;
 	unsigned long cur_seq;
 
-	for (mni = mn_itree_inv_start_range(mmn_mm, range, &cur_seq); mni;
-	     mni = mn_itree_inv_next(mni, range)) {
+	for (mni = mn_itree_inv_start_range(subscriptions, range, &cur_seq);
+	     mni; mni = mn_itree_inv_next(mni, range)) {
 		bool ret;
 
 		ret = mni->ops->invalidate(mni, range, cur_seq);
@@ -443,19 +448,20 @@ static int mn_itree_invalidate(struct mmu_notifier_mm *mmn_mm,
 	 * On -EAGAIN the non-blocking caller is not allowed to call
 	 * invalidate_range_end()
 	 */
-	mn_itree_inv_end(mmn_mm);
+	mn_itree_inv_end(subscriptions);
 	return -EAGAIN;
 }
 
-static int mn_hlist_invalidate_range_start(struct mmu_notifier_mm *mmn_mm,
-					   struct mmu_notifier_range *range)
+static int mn_hlist_invalidate_range_start(
+	struct mmu_notifier_subscriptions *subscriptions,
+	struct mmu_notifier_range *range)
 {
 	struct mmu_notifier *mn;
 	int ret = 0;
 	int id;
 
 	id = srcu_read_lock(&srcu);
-	hlist_for_each_entry_rcu(mn, &mmn_mm->list, hlist) {
+	hlist_for_each_entry_rcu(mn, &subscriptions->list, hlist) {
 		if (mn->ops->invalidate_range_start) {
 			int _ret;
 
@@ -481,28 +487,29 @@ static int mn_hlist_invalidate_range_start(struct mmu_notifier_mm *mmn_mm,
 
 int __mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
 {
-	struct mmu_notifier_mm *mmn_mm = range->mm->mmu_notifier_mm;
+	struct mmu_notifier_subscriptions *subscriptions =
+		range->mm->notifier_subscriptions;
 	int ret;
 
-	if (mmn_mm->has_itree) {
-		ret = mn_itree_invalidate(mmn_mm, range);
+	if (subscriptions->has_itree) {
+		ret = mn_itree_invalidate(subscriptions, range);
 		if (ret)
 			return ret;
 	}
-	if (!hlist_empty(&mmn_mm->list))
-		return mn_hlist_invalidate_range_start(mmn_mm, range);
+	if (!hlist_empty(&subscriptions->list))
+		return mn_hlist_invalidate_range_start(subscriptions, range);
 	return 0;
 }
 
-static void mn_hlist_invalidate_end(struct mmu_notifier_mm *mmn_mm,
-				    struct mmu_notifier_range *range,
-				    bool only_end)
+static void
+mn_hlist_invalidate_end(struct mmu_notifier_subscriptions *subscriptions,
+			struct mmu_notifier_range *range, bool only_end)
 {
 	struct mmu_notifier *mn;
 	int id;
 
 	id = srcu_read_lock(&srcu);
-	hlist_for_each_entry_rcu(mn, &mmn_mm->list, hlist) {
+	hlist_for_each_entry_rcu(mn, &subscriptions->list, hlist) {
 		/*
 		 * Call invalidate_range here too to avoid the need for the
 		 * subsystem of having to register an invalidate_range_end
@@ -534,14 +541,15 @@ static void mn_hlist_invalidate_end(struct mmu_notifier_mm *mmn_mm,
 void __mmu_notifier_invalidate_range_end(struct mmu_notifier_range *range,
 					 bool only_end)
 {
-	struct mmu_notifier_mm *mmn_mm = range->mm->mmu_notifier_mm;
+	struct mmu_notifier_subscriptions *subscriptions =
+		range->mm->notifier_subscriptions;
 
 	lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
-	if (mmn_mm->has_itree)
-		mn_itree_inv_end(mmn_mm);
+	if (subscriptions->has_itree)
+		mn_itree_inv_end(subscriptions);
 
-	if (!hlist_empty(&mmn_mm->list))
-		mn_hlist_invalidate_end(mmn_mm, range, only_end);
+	if (!hlist_empty(&subscriptions->list))
+		mn_hlist_invalidate_end(subscriptions, range, only_end);
 	lock_map_release(&__mmu_notifier_invalidate_range_start_map);
 }
 
@@ -552,7 +560,7 @@ void __mmu_notifier_invalidate_range(struct mm_struct *mm,
 	int id;
 
 	id = srcu_read_lock(&srcu);
-	hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) {
+	hlist_for_each_entry_rcu(mn, &mm->notifier_subscriptions->list, hlist) {
 		if (mn->ops->invalidate_range)
 			mn->ops->invalidate_range(mn, mm, start, end);
 	}
@@ -566,7 +574,7 @@ void __mmu_notifier_invalidate_range(struct mm_struct *mm,
  */
 int __mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
 {
-	struct mmu_notifier_mm *mmu_notifier_mm = NULL;
+	struct mmu_notifier_subscriptions *subscriptions = NULL;
 	int ret;
 
 	lockdep_assert_held_write(&mm->mmap_sem);
@@ -579,23 +587,23 @@ int __mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
 		fs_reclaim_release(GFP_KERNEL);
 	}
 
-	if (!mm->mmu_notifier_mm) {
+	if (!mm->notifier_subscriptions) {
 		/*
 		 * kmalloc cannot be called under mm_take_all_locks(), but we
-		 * know that mm->mmu_notifier_mm can't change while we hold
-		 * the write side of the mmap_sem.
+		 * know that mm->notifier_subscriptions can't change while we
+		 * hold the write side of the mmap_sem.
 		 */
-		mmu_notifier_mm =
-			kzalloc(sizeof(struct mmu_notifier_mm), GFP_KERNEL);
-		if (!mmu_notifier_mm)
+		subscriptions = kzalloc(
+			sizeof(struct mmu_notifier_subscriptions), GFP_KERNEL);
+		if (!subscriptions)
 			return -ENOMEM;
 
-		INIT_HLIST_HEAD(&mmu_notifier_mm->list);
-		spin_lock_init(&mmu_notifier_mm->lock);
-		mmu_notifier_mm->invalidate_seq = 2;
-		mmu_notifier_mm->itree = RB_ROOT_CACHED;
-		init_waitqueue_head(&mmu_notifier_mm->wq);
-		INIT_HLIST_HEAD(&mmu_notifier_mm->deferred_list);
+		INIT_HLIST_HEAD(&subscriptions->list);
+		spin_lock_init(&subscriptions->lock);
+		subscriptions->invalidate_seq = 2;
+		subscriptions->itree = RB_ROOT_CACHED;
+		init_waitqueue_head(&subscriptions->wq);
+		INIT_HLIST_HEAD(&subscriptions->deferred_list);
 	}
 
 	ret = mm_take_all_locks(mm);
@@ -610,15 +618,16 @@ int __mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
 	 * We can't race against any other mmu notifier method either
 	 * thanks to mm_take_all_locks().
 	 *
-	 * release semantics on the initialization of the mmu_notifier_mm's
-	 * contents are provided for unlocked readers.  acquire can only be
-	 * used while holding the mmgrab or mmget, and is safe because once
-	 * created the mmu_notififer_mm is not freed until the mm is
-	 * destroyed.  As above, users holding the mmap_sem or one of the
+	 * release semantics on the initialization of the
+	 * mmu_notifier_subscriptions's contents are provided for unlocked
+	 * readers.  acquire can only be used while holding the mmgrab or
+	 * mmget, and is safe because once created the
+	 * mmu_notifier_subscriptions is not freed until the mm is destroyed.
+	 * As above, users holding the mmap_sem or one of the
 	 * mm_take_all_locks() do not need to use acquire semantics.
 	 */
-	if (mmu_notifier_mm)
-		smp_store_release(&mm->mmu_notifier_mm, mmu_notifier_mm);
+	if (subscriptions)
+		smp_store_release(&mm->notifier_subscriptions, subscriptions);
 
 	if (mn) {
 		/* Pairs with the mmdrop in mmu_notifier_unregister_* */
@@ -626,18 +635,19 @@ int __mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
 		mn->mm = mm;
 		mn->users = 1;
 
-		spin_lock(&mm->mmu_notifier_mm->lock);
-		hlist_add_head_rcu(&mn->hlist, &mm->mmu_notifier_mm->list);
-		spin_unlock(&mm->mmu_notifier_mm->lock);
+		spin_lock(&mm->notifier_subscriptions->lock);
+		hlist_add_head_rcu(&mn->hlist,
+				   &mm->notifier_subscriptions->list);
+		spin_unlock(&mm->notifier_subscriptions->lock);
 	} else
-		mm->mmu_notifier_mm->has_itree = true;
+		mm->notifier_subscriptions->has_itree = true;
 
 	mm_drop_all_locks(mm);
 	BUG_ON(atomic_read(&mm->mm_users) <= 0);
 	return 0;
 
 out_clean:
-	kfree(mmu_notifier_mm);
+	kfree(subscriptions);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(__mmu_notifier_register);
@@ -677,8 +687,9 @@ find_get_mmu_notifier(struct mm_struct *mm, const struct mmu_notifier_ops *ops)
 {
 	struct mmu_notifier *mn;
 
-	spin_lock(&mm->mmu_notifier_mm->lock);
-	hlist_for_each_entry_rcu (mn, &mm->mmu_notifier_mm->list, hlist) {
+	spin_lock(&mm->notifier_subscriptions->lock);
+	hlist_for_each_entry_rcu(mn, &mm->notifier_subscriptions->list,
+				 hlist) {
 		if (mn->ops != ops)
 			continue;
 
@@ -686,10 +697,10 @@ find_get_mmu_notifier(struct mm_struct *mm, const struct mmu_notifier_ops *ops)
 			mn->users++;
 		else
 			mn = ERR_PTR(-EOVERFLOW);
-		spin_unlock(&mm->mmu_notifier_mm->lock);
+		spin_unlock(&mm->notifier_subscriptions->lock);
 		return mn;
 	}
-	spin_unlock(&mm->mmu_notifier_mm->lock);
+	spin_unlock(&mm->notifier_subscriptions->lock);
 	return NULL;
 }
 
@@ -718,7 +729,7 @@ struct mmu_notifier *mmu_notifier_get_locked(const struct mmu_notifier_ops *ops,
 
 	lockdep_assert_held_write(&mm->mmap_sem);
 
-	if (mm->mmu_notifier_mm) {
+	if (mm->notifier_subscriptions) {
 		mn = find_get_mmu_notifier(mm, ops);
 		if (mn)
 			return mn;
@@ -739,11 +750,11 @@ struct mmu_notifier *mmu_notifier_get_locked(const struct mmu_notifier_ops *ops,
 EXPORT_SYMBOL_GPL(mmu_notifier_get_locked);
 
 /* this is called after the last mmu_notifier_unregister() returned */
-void __mmu_notifier_mm_destroy(struct mm_struct *mm)
+void __mmu_notifier_subscriptions_destroy(struct mm_struct *mm)
 {
-	BUG_ON(!hlist_empty(&mm->mmu_notifier_mm->list));
-	kfree(mm->mmu_notifier_mm);
-	mm->mmu_notifier_mm = LIST_POISON1; /* debug */
+	BUG_ON(!hlist_empty(&mm->notifier_subscriptions->list));
+	kfree(mm->notifier_subscriptions);
+	mm->notifier_subscriptions = LIST_POISON1; /* debug */
 }
 
 /*
@@ -776,13 +787,13 @@ void mmu_notifier_unregister(struct mmu_notifier *mn, struct mm_struct *mm)
 			mn->ops->release(mn, mm);
 		srcu_read_unlock(&srcu, id);
 
-		spin_lock(&mm->mmu_notifier_mm->lock);
+		spin_lock(&mm->notifier_subscriptions->lock);
 		/*
 		 * Can not use list_del_rcu() since __mmu_notifier_release
 		 * can delete it before we hold the lock.
 		 */
 		hlist_del_init_rcu(&mn->hlist);
-		spin_unlock(&mm->mmu_notifier_mm->lock);
+		spin_unlock(&mm->notifier_subscriptions->lock);
 	}
 
 	/*
@@ -833,23 +844,23 @@ void mmu_notifier_put(struct mmu_notifier *mn)
 {
 	struct mm_struct *mm = mn->mm;
 
-	spin_lock(&mm->mmu_notifier_mm->lock);
+	spin_lock(&mm->notifier_subscriptions->lock);
 	if (WARN_ON(!mn->users) || --mn->users)
 		goto out_unlock;
 	hlist_del_init_rcu(&mn->hlist);
-	spin_unlock(&mm->mmu_notifier_mm->lock);
+	spin_unlock(&mm->notifier_subscriptions->lock);
 
 	call_srcu(&srcu, &mn->rcu, mmu_notifier_free_rcu);
 	return;
 
 out_unlock:
-	spin_unlock(&mm->mmu_notifier_mm->lock);
+	spin_unlock(&mm->notifier_subscriptions->lock);
 }
 EXPORT_SYMBOL_GPL(mmu_notifier_put);
 
 static int __mmu_interval_notifier_insert(
 	struct mmu_interval_notifier *mni, struct mm_struct *mm,
-	struct mmu_notifier_mm *mmn_mm, unsigned long start,
+	struct mmu_notifier_subscriptions *subscriptions, unsigned long start,
 	unsigned long length, const struct mmu_interval_notifier_ops *ops)
 {
 	mni->mm = mm;
@@ -884,29 +895,30 @@ static int __mmu_interval_notifier_insert(
 	 * In all cases the value for the mni->invalidate_seq should be
 	 * odd, see mmu_interval_read_begin()
 	 */
-	spin_lock(&mmn_mm->lock);
-	if (mmn_mm->active_invalidate_ranges) {
-		if (mn_itree_is_invalidating(mmn_mm))
+	spin_lock(&subscriptions->lock);
+	if (subscriptions->active_invalidate_ranges) {
+		if (mn_itree_is_invalidating(subscriptions))
 			hlist_add_head(&mni->deferred_item,
-				       &mmn_mm->deferred_list);
+				       &subscriptions->deferred_list);
 		else {
-			mmn_mm->invalidate_seq |= 1;
+			subscriptions->invalidate_seq |= 1;
 			interval_tree_insert(&mni->interval_tree,
-					     &mmn_mm->itree);
+					     &subscriptions->itree);
 		}
-		mni->invalidate_seq = mmn_mm->invalidate_seq;
+		mni->invalidate_seq = subscriptions->invalidate_seq;
 	} else {
-		WARN_ON(mn_itree_is_invalidating(mmn_mm));
+		WARN_ON(mn_itree_is_invalidating(subscriptions));
 		/*
 		 * The starting seq for a mni not under invalidation should be
 		 * odd, not equal to the current invalidate_seq and
 		 * invalidate_seq should not 'wrap' to the new seq any time
 		 * soon.
 		 */
-		mni->invalidate_seq = mmn_mm->invalidate_seq - 1;
-		interval_tree_insert(&mni->interval_tree, &mmn_mm->itree);
+		mni->invalidate_seq = subscriptions->invalidate_seq - 1;
+		interval_tree_insert(&mni->interval_tree,
+				     &subscriptions->itree);
 	}
-	spin_unlock(&mmn_mm->lock);
+	spin_unlock(&subscriptions->lock);
 	return 0;
 }
 
@@ -930,20 +942,20 @@ int mmu_interval_notifier_insert(struct mmu_interval_notifier *mni,
 				 unsigned long length,
 				 const struct mmu_interval_notifier_ops *ops)
 {
-	struct mmu_notifier_mm *mmn_mm;
+	struct mmu_notifier_subscriptions *subscriptions;
 	int ret;
 
 	might_lock(&mm->mmap_sem);
 
-	mmn_mm = smp_load_acquire(&mm->mmu_notifier_mm);
-	if (!mmn_mm || !mmn_mm->has_itree) {
+	subscriptions = smp_load_acquire(&mm->notifier_subscriptions);
+	if (!subscriptions || !subscriptions->has_itree) {
 		ret = mmu_notifier_register(NULL, mm);
 		if (ret)
 			return ret;
-		mmn_mm = mm->mmu_notifier_mm;
+		subscriptions = mm->notifier_subscriptions;
 	}
-	return __mmu_interval_notifier_insert(mni, mm, mmn_mm, start, length,
-					      ops);
+	return __mmu_interval_notifier_insert(mni, mm, subscriptions, start,
+					      length, ops);
 }
 EXPORT_SYMBOL_GPL(mmu_interval_notifier_insert);
 
@@ -952,20 +964,20 @@ int mmu_interval_notifier_insert_locked(
 	unsigned long start, unsigned long length,
 	const struct mmu_interval_notifier_ops *ops)
 {
-	struct mmu_notifier_mm *mmn_mm;
+	struct mmu_notifier_subscriptions *subscriptions =
+		mm->notifier_subscriptions;
 	int ret;
 
 	lockdep_assert_held_write(&mm->mmap_sem);
 
-	mmn_mm = mm->mmu_notifier_mm;
-	if (!mmn_mm || !mmn_mm->has_itree) {
+	if (!subscriptions || !subscriptions->has_itree) {
 		ret = __mmu_notifier_register(NULL, mm);
 		if (ret)
 			return ret;
-		mmn_mm = mm->mmu_notifier_mm;
+		subscriptions = mm->notifier_subscriptions;
 	}
-	return __mmu_interval_notifier_insert(mni, mm, mmn_mm, start, length,
-					      ops);
+	return __mmu_interval_notifier_insert(mni, mm, subscriptions, start,
+					      length, ops);
 }
 EXPORT_SYMBOL_GPL(mmu_interval_notifier_insert_locked);
 
@@ -982,13 +994,14 @@ EXPORT_SYMBOL_GPL(mmu_interval_notifier_insert_locked);
 void mmu_interval_notifier_remove(struct mmu_interval_notifier *mni)
 {
 	struct mm_struct *mm = mni->mm;
-	struct mmu_notifier_mm *mmn_mm = mm->mmu_notifier_mm;
+	struct mmu_notifier_subscriptions *subscriptions =
+		mm->notifier_subscriptions;
 	unsigned long seq = 0;
 
 	might_sleep();
 
-	spin_lock(&mmn_mm->lock);
-	if (mn_itree_is_invalidating(mmn_mm)) {
+	spin_lock(&subscriptions->lock);
+	if (mn_itree_is_invalidating(subscriptions)) {
 		/*
 		 * remove is being called after insert put this on the
 		 * deferred list, but before the deferred list was processed.
@@ -997,14 +1010,15 @@ void mmu_interval_notifier_remove(struct mmu_interval_notifier *mni)
 			hlist_del(&mni->deferred_item);
 		} else {
 			hlist_add_head(&mni->deferred_item,
-				       &mmn_mm->deferred_list);
-			seq = mmn_mm->invalidate_seq;
+				       &subscriptions->deferred_list);
+			seq = subscriptions->invalidate_seq;
 		}
 	} else {
 		WARN_ON(RB_EMPTY_NODE(&mni->interval_tree.rb));
-		interval_tree_remove(&mni->interval_tree, &mmn_mm->itree);
+		interval_tree_remove(&mni->interval_tree,
+				     &subscriptions->itree);
 	}
-	spin_unlock(&mmn_mm->lock);
+	spin_unlock(&subscriptions->lock);
 
 	/*
 	 * The possible sleep on progress in the invalidation requires the
@@ -1013,8 +1027,8 @@ void mmu_interval_notifier_remove(struct mmu_interval_notifier *mni)
 	lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
 	lock_map_release(&__mmu_notifier_invalidate_range_start_map);
 	if (seq)
-		wait_event(mmn_mm->wq,
-			   READ_ONCE(mmn_mm->invalidate_seq) != seq);
+		wait_event(subscriptions->wq,
+			   READ_ONCE(subscriptions->invalidate_seq) != seq);
 
 	/* pairs with mmgrab in mmu_interval_notifier_insert() */
 	mmdrop(mm);
Linus Torvalds Dec. 18, 2019, 7:33 p.m. UTC | #13
On Wed, Dec 18, 2019 at 10:37 AM Jason Gunthorpe <jgg@mellanox.com> wrote:
>
> I think this is what you are looking for?

I think that with these names, I would have had an easier time reading
the original patch that made me go "Eww", yes.

Of course, now that it's just a rename patch, it's not like the patch
is all that legible, but yeah, I think the naming is saner.

              Linus