diff mbox series

[RFC,4/8] mm/madvise: define madvise behavior in a struct

Message ID 20210926161259.238054-5-namit@vmware.com (mailing list archive)
State New
Headers show
Series mm/madvise: support process_madvise(MADV_DONTNEED) | expand

Commit Message

Nadav Amit Sept. 26, 2021, 4:12 p.m. UTC
From: Nadav Amit <namit@vmware.com>

The different behaviors of madvise are different in several ways, which
are distributed across several functions. Use the design pattern from
iouring in order to define the actions that are required for each
behavior.

The next patches will get rid of old helper functions that are modified
in this patch and the redundant use of array_index_nospec(). The next
patches will add more actions for each leaf into the new struct.

No functional change is intended.

Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Colin Cross <ccross@google.com>
Cc: Suren Baghdasarya <surenb@google.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
---
 mm/madvise.c | 168 +++++++++++++++++++++++++++++++++------------------
 1 file changed, 109 insertions(+), 59 deletions(-)

Comments

Kirill A. Shutemov Sept. 27, 2021, 9:31 a.m. UTC | #1
On Sun, Sep 26, 2021 at 09:12:55AM -0700, Nadav Amit wrote:
> From: Nadav Amit <namit@vmware.com>
> 
> The different behaviors of madvise are different in several ways, which
> are distributed across several functions. Use the design pattern from
> iouring in order to define the actions that are required for each
> behavior.
> 
> The next patches will get rid of old helper functions that are modified
> in this patch and the redundant use of array_index_nospec(). The next
> patches will add more actions for each leaf into the new struct.
> 
> No functional change is intended.
> 
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Colin Cross <ccross@google.com>
> Cc: Suren Baghdasarya <surenb@google.com>
> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
> Signed-off-by: Nadav Amit <namit@vmware.com>
> ---
>  mm/madvise.c | 168 +++++++++++++++++++++++++++++++++------------------
>  1 file changed, 109 insertions(+), 59 deletions(-)
> 
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 17e39c70704b..127507c71ba9 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -29,6 +29,7 @@
>  #include <linux/swapops.h>
>  #include <linux/shmem_fs.h>
>  #include <linux/mmu_notifier.h>
> +#include <linux/nospec.h>
>  
>  #include <asm/tlb.h>
>  
> @@ -39,6 +40,101 @@ struct madvise_walk_private {
>  	bool pageout;
>  };
>  
> +struct madvise_info {
> +	u8 behavior_valid: 1;
> +	u8 process_behavior_valid: 1;
> +	u8 need_mmap_read_only: 1;
> +};
> +
> +static const struct madvise_info madvise_info[MADV_SOFT_OFFLINE+1] = {

MADV_SOFT_OFFLINE+1 smells bad.

And I don't like the change in general. Given that MADV_SOFT_OFFLINE is
101, the array will be mostly empty.

I donno. I don't see any improvement with the patch. But maybe it's only me.
Nadav Amit Sept. 27, 2021, 10:31 a.m. UTC | #2
> On Sep 27, 2021, at 2:31 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> 
> On Sun, Sep 26, 2021 at 09:12:55AM -0700, Nadav Amit wrote:
>> From: Nadav Amit <namit@vmware.com>
>> 
>> The different behaviors of madvise are different in several ways, which
>> are distributed across several functions. Use the design pattern from
>> iouring in order to define the actions that are required for each
>> behavior.
>> 
>> The next patches will get rid of old helper functions that are modified
>> in this patch and the redundant use of array_index_nospec(). The next
>> patches will add more actions for each leaf into the new struct.
>> 
>> No functional change is intended.
>> 
>> Cc: Andrea Arcangeli <aarcange@redhat.com>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Minchan Kim <minchan@kernel.org>
>> Cc: Colin Cross <ccross@google.com>
>> Cc: Suren Baghdasarya <surenb@google.com>
>> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
>> Signed-off-by: Nadav Amit <namit@vmware.com>
>> ---
>> mm/madvise.c | 168 +++++++++++++++++++++++++++++++++------------------
>> 1 file changed, 109 insertions(+), 59 deletions(-)
>> 
>> diff --git a/mm/madvise.c b/mm/madvise.c
>> index 17e39c70704b..127507c71ba9 100644
>> --- a/mm/madvise.c
>> +++ b/mm/madvise.c
>> @@ -29,6 +29,7 @@
>> #include <linux/swapops.h>
>> #include <linux/shmem_fs.h>
>> #include <linux/mmu_notifier.h>
>> +#include <linux/nospec.h>
>> 
>> #include <asm/tlb.h>
>> 
>> @@ -39,6 +40,101 @@ struct madvise_walk_private {
>> 	bool pageout;
>> };
>> 
>> +struct madvise_info {
>> +	u8 behavior_valid: 1;
>> +	u8 process_behavior_valid: 1;
>> +	u8 need_mmap_read_only: 1;
>> +};
>> +
>> +static const struct madvise_info madvise_info[MADV_SOFT_OFFLINE+1] = {
> 
> MADV_SOFT_OFFLINE+1 smells bad.

I can set another constant instead and let the compiler shout if anything
outside the array is initialized.

> 
> And I don't like the change in general. Given that MADV_SOFT_OFFLINE is
> 101, the array will be mostly empty.

Seriously, these is less than 128B - two cachelines. Perhaps they should
be aligned. But this whole change should have no effect on code/data size.

> 
> I donno. I don't see any improvement with the patch. But maybe it's only me.

The following patches make it clearer when TLBs flushes are batched and
when mmap_lock is not taken (which is by the way not clear from the code).

I could have added two more functions for that and it would have taken
me less time. I do not think the end result of having ~5 different
functions to figure out the actions needed for each behavior would be
as clear.
Kirill A. Shutemov Sept. 27, 2021, 12:14 p.m. UTC | #3
On Mon, Sep 27, 2021 at 03:31:21AM -0700, Nadav Amit wrote:
> 
> 
> > On Sep 27, 2021, at 2:31 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> > 
> > On Sun, Sep 26, 2021 at 09:12:55AM -0700, Nadav Amit wrote:
> >> From: Nadav Amit <namit@vmware.com>
> >> 
> >> The different behaviors of madvise are different in several ways, which
> >> are distributed across several functions. Use the design pattern from
> >> iouring in order to define the actions that are required for each
> >> behavior.
> >> 
> >> The next patches will get rid of old helper functions that are modified
> >> in this patch and the redundant use of array_index_nospec(). The next
> >> patches will add more actions for each leaf into the new struct.
> >> 
> >> No functional change is intended.
> >> 
> >> Cc: Andrea Arcangeli <aarcange@redhat.com>
> >> Cc: Andrew Morton <akpm@linux-foundation.org>
> >> Cc: Minchan Kim <minchan@kernel.org>
> >> Cc: Colin Cross <ccross@google.com>
> >> Cc: Suren Baghdasarya <surenb@google.com>
> >> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
> >> Signed-off-by: Nadav Amit <namit@vmware.com>
> >> ---
> >> mm/madvise.c | 168 +++++++++++++++++++++++++++++++++------------------
> >> 1 file changed, 109 insertions(+), 59 deletions(-)
> >> 
> >> diff --git a/mm/madvise.c b/mm/madvise.c
> >> index 17e39c70704b..127507c71ba9 100644
> >> --- a/mm/madvise.c
> >> +++ b/mm/madvise.c
> >> @@ -29,6 +29,7 @@
> >> #include <linux/swapops.h>
> >> #include <linux/shmem_fs.h>
> >> #include <linux/mmu_notifier.h>
> >> +#include <linux/nospec.h>
> >> 
> >> #include <asm/tlb.h>
> >> 
> >> @@ -39,6 +40,101 @@ struct madvise_walk_private {
> >> 	bool pageout;
> >> };
> >> 
> >> +struct madvise_info {
> >> +	u8 behavior_valid: 1;
> >> +	u8 process_behavior_valid: 1;
> >> +	u8 need_mmap_read_only: 1;
> >> +};
> >> +
> >> +static const struct madvise_info madvise_info[MADV_SOFT_OFFLINE+1] = {
> > 
> > MADV_SOFT_OFFLINE+1 smells bad.
> 
> I can set another constant instead and let the compiler shout if anything
> outside the array is initialized.

I would rather introduce a function that would return struct madvise_info
for a given behavior. The function would have a switch inside. The default:
may have BUILD_BUG() or something.
Nadav Amit Sept. 27, 2021, 8:36 p.m. UTC | #4
> On Sep 27, 2021, at 5:14 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> 
> On Mon, Sep 27, 2021 at 03:31:21AM -0700, Nadav Amit wrote:
>> 
>> 
>>> On Sep 27, 2021, at 2:31 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
>>> 
>>> On Sun, Sep 26, 2021 at 09:12:55AM -0700, Nadav Amit wrote:
>>>> From: Nadav Amit <namit@vmware.com>
>>>> 
>>>> The different behaviors of madvise are different in several ways, which
>>>> are distributed across several functions. Use the design pattern from
>>>> iouring in order to define the actions that are required for each
>>>> behavior.
>>>> 
>>>> The next patches will get rid of old helper functions that are modified
>>>> in this patch and the redundant use of array_index_nospec(). The next
>>>> patches will add more actions for each leaf into the new struct.
>>> 

[ snip ]

>>> MADV_SOFT_OFFLINE+1 smells bad.
>> 
>> I can set another constant instead and let the compiler shout if anything
>> outside the array is initialized.
> 
> I would rather introduce a function that would return struct madvise_info
> for a given behavior. The function would have a switch inside. The default:
> may have BUILD_BUG() or something.

Sounds better than my solution. I will do so.
diff mbox series

Patch

diff --git a/mm/madvise.c b/mm/madvise.c
index 17e39c70704b..127507c71ba9 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -29,6 +29,7 @@ 
 #include <linux/swapops.h>
 #include <linux/shmem_fs.h>
 #include <linux/mmu_notifier.h>
+#include <linux/nospec.h>
 
 #include <asm/tlb.h>
 
@@ -39,6 +40,101 @@  struct madvise_walk_private {
 	bool pageout;
 };
 
+struct madvise_info {
+	u8 behavior_valid: 1;
+	u8 process_behavior_valid: 1;
+	u8 need_mmap_read_only: 1;
+};
+
+static const struct madvise_info madvise_info[MADV_SOFT_OFFLINE+1] = {
+	[MADV_DOFORK] = {
+		.behavior_valid = 1,
+	},
+	[MADV_DONTFORK] = {
+		.behavior_valid = 1,
+	},
+	[MADV_NORMAL] = {
+		.behavior_valid = 1,
+	},
+	[MADV_SEQUENTIAL] = {
+		.behavior_valid = 1,
+	},
+	[MADV_RANDOM] = {
+		.behavior_valid = 1,
+	},
+	[MADV_REMOVE] = {
+		.behavior_valid = 1,
+		.need_mmap_read_only = 1,
+	},
+	[MADV_WILLNEED] = {
+		.behavior_valid = 1,
+		.process_behavior_valid = 1,
+		.need_mmap_read_only = 1,
+	},
+	[MADV_DONTNEED] = {
+		.behavior_valid = 1,
+		.need_mmap_read_only = 1,
+	},
+	[MADV_FREE] = {
+		.behavior_valid = 1,
+		.need_mmap_read_only = 1,
+	},
+	[MADV_COLD] = {
+		.behavior_valid = 1,
+		.process_behavior_valid = 1,
+		.need_mmap_read_only = 1,
+	},
+	[MADV_PAGEOUT] = {
+		.behavior_valid = 1,
+		.process_behavior_valid = 1,
+		.need_mmap_read_only = 1,
+	},
+#ifdef CONFIG_KSM
+	[MADV_MERGEABLE] = {
+		.behavior_valid = 1,
+	},
+	[MADV_UNMERGEABLE] = {
+		.behavior_valid = 1,
+	},
+#endif
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	[MADV_HUGEPAGE] = {
+		.behavior_valid = 1,
+	},
+	[MADV_NOHUGEPAGE] = {
+		.behavior_valid = 1,
+	},
+#endif
+	[MADV_DONTDUMP] = {
+		.behavior_valid = 1,
+	},
+	[MADV_DODUMP] = {
+		.behavior_valid = 1,
+	},
+	[MADV_WIPEONFORK] = {
+		.behavior_valid = 1,
+	},
+	[MADV_KEEPONFORK] = {
+		.behavior_valid = 1,
+	},
+#ifdef CONFIG_MEMORY_FAILURE
+	[MADV_HWPOISON] = {
+		.behavior_valid = 1,
+	},
+	[MADV_SOFT_OFFLINE] = {
+		.behavior_valid = 1,
+	},
+#endif
+	[MADV_POPULATE_READ] = {
+		.behavior_valid = 1,
+		.need_mmap_read_only = 1,
+	},
+	[MADV_POPULATE_WRITE] = {
+		.behavior_valid = 1,
+		.need_mmap_read_only = 1,
+	},
+};
+
 /*
  * Any behaviour which results in changes to the vma->vm_flags needs to
  * take mmap_lock for writing. Others, which simply traverse vmas, need
@@ -46,20 +142,7 @@  struct madvise_walk_private {
  */
 static int madvise_need_mmap_write(int behavior)
 {
-	switch (behavior) {
-	case MADV_REMOVE:
-	case MADV_WILLNEED:
-	case MADV_DONTNEED:
-	case MADV_COLD:
-	case MADV_PAGEOUT:
-	case MADV_FREE:
-	case MADV_POPULATE_READ:
-	case MADV_POPULATE_WRITE:
-		return 0;
-	default:
-		/* be safe, default to 1. list exceptions explicitly */
-		return 1;
-	}
+	return !madvise_info[behavior].need_mmap_read_only;
 }
 
 /*
@@ -999,56 +1082,23 @@  madvise_vma(struct vm_area_struct *vma, struct vm_area_struct **prev,
 }
 
 static bool
-madvise_behavior_valid(int behavior)
+madvise_behavior_valid(int *behavior)
 {
-	switch (behavior) {
-	case MADV_DOFORK:
-	case MADV_DONTFORK:
-	case MADV_NORMAL:
-	case MADV_SEQUENTIAL:
-	case MADV_RANDOM:
-	case MADV_REMOVE:
-	case MADV_WILLNEED:
-	case MADV_DONTNEED:
-	case MADV_FREE:
-	case MADV_COLD:
-	case MADV_PAGEOUT:
-	case MADV_POPULATE_READ:
-	case MADV_POPULATE_WRITE:
-#ifdef CONFIG_KSM
-	case MADV_MERGEABLE:
-	case MADV_UNMERGEABLE:
-#endif
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	case MADV_HUGEPAGE:
-	case MADV_NOHUGEPAGE:
-#endif
-	case MADV_DONTDUMP:
-	case MADV_DODUMP:
-	case MADV_WIPEONFORK:
-	case MADV_KEEPONFORK:
-#ifdef CONFIG_MEMORY_FAILURE
-	case MADV_SOFT_OFFLINE:
-	case MADV_HWPOISON:
-#endif
-		return true;
-
-	default:
+	if (*behavior >= ARRAY_SIZE(madvise_info))
 		return false;
-	}
+
+	*behavior = array_index_nospec(*behavior, ARRAY_SIZE(madvise_info));
+	return madvise_info[*behavior].behavior_valid;
 }
 
 static bool
-process_madvise_behavior_valid(int behavior)
+process_madvise_behavior_valid(int *behavior)
 {
-	switch (behavior) {
-	case MADV_COLD:
-	case MADV_PAGEOUT:
-	case MADV_WILLNEED:
-		return true;
-	default:
+	if (*behavior >= ARRAY_SIZE(madvise_info))
 		return false;
-	}
+
+	*behavior = array_index_nospec(*behavior, ARRAY_SIZE(madvise_info));
+	return madvise_info[*behavior].process_behavior_valid;
 }
 
 /*
@@ -1133,7 +1183,7 @@  int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int beh
 
 	start = untagged_addr(start);
 
-	if (!madvise_behavior_valid(behavior))
+	if (!madvise_behavior_valid(&behavior))
 		return error;
 
 	if (!PAGE_ALIGNED(start))
@@ -1258,7 +1308,7 @@  SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
 		goto put_pid;
 	}
 
-	if (!process_madvise_behavior_valid(behavior)) {
+	if (!process_madvise_behavior_valid(&behavior)) {
 		ret = -EINVAL;
 		goto release_task;
 	}