[4/6] mm: add a gup_fixup_start_addr hook

Message ID 20190525133203.25853-5-hch@lst.de (mailing list archive)
State Superseded
Series [1/6] MIPS: use the generic get_user_pages_fast code

Commit Message

Christoph Hellwig May 25, 2019, 1:32 p.m. UTC
This will allow sparc64 to override its ADI tags for
__get_user_pages_fast and get_user_pages_fast.  I have no idea why this
is not required for plain old get_user_pages, but it keeps the
existing sparc64 behavior.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 mm/gup.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

Linus Torvalds May 25, 2019, 5:05 p.m. UTC | #1
[ Adding Khalid, who added the sparc64 code ]

On Sat, May 25, 2019 at 6:32 AM Christoph Hellwig <hch@lst.de> wrote:
>
> This will allow sparc64 to override its ADI tags for
> get_user_pages and get_user_pages_fast.  I have no idea why this
> is not required for plain old get_user_pages, but it keeps the
> existing sparc64 behavior.

This is actually generic. ARM64 has tagged pointers too. Right now the
system call interfaces are all supposed to mask off the tags, but
there's been noise about having the kernel understand them.

That said:

> +#ifndef gup_fixup_start_addr
> +#define gup_fixup_start_addr(start)    (start)
> +#endif

I'd rather name this much more specifically (ie make it very much
about "clean up pointer tags") and I'm also not clear on why sparc64
actually wants this. I thought the sparc64 rules were the same as the
(current) arm64 rules: any addresses passed to the kernel have to be
the non-tagged ones.

As you say, nothing *else* in the kernel does that address cleanup,
why should get_user_pages_fast() do it?

David? Khalid? Why does sparc64 actually need this? It looks like the
generic get_user_pages() doesn't do it.

                Linus

Khalid Aziz May 28, 2019, 3:57 p.m. UTC | #2
On 5/25/19 11:05 AM, Linus Torvalds wrote:
> [ Adding Khalid, who added the sparc64 code ]
> 
> On Sat, May 25, 2019 at 6:32 AM Christoph Hellwig <hch@lst.de> wrote:
>>
>> This will allow sparc64 to override its ADI tags for
>> get_user_pages and get_user_pages_fast.  I have no idea why this
>> is not required for plain old get_user_pages, but it keeps the
>> existing sparc64 behavior.
> 
> This is actually generic. ARM64 has tagged pointers too. Right now the
> system call interfaces are all supposed to mask off the tags, but
> there's been noise about having the kernel understand them.
> 
> That said:
> 
>> +#ifndef gup_fixup_start_addr
>> +#define gup_fixup_start_addr(start)    (start)
>> +#endif
> 
> I'd rather name this much more specifically (ie make it very much
> about "clean up pointer tags") and I'm also not clear on why sparc64
> actually wants this. I thought the sparc64 rules were the same as the
> (current) arm64 rules: any addresses passed to the kernel have to be
> the non-tagged ones.
> 
> As you say, nothing *else* in the kernel does that address cleanup,
> why should get_user_pages_fast() do it?
> 
> David? Khalid? Why does sparc64 actually need this? It looks like the
> generic get_user_pages() doesn't do it.
> 


There is another discussion going on about tagged pointers on ARM64 and
their intersection with the sparc64 code. I agree there is a generic need to
mask off tags for kernel use now that ARM64 is also looking into supporting
memory tagging. The need comes from sparc64 not storing tagged addresses
in VMAs. It is not practical to store tagged addresses in VMAs because
manipulation of address tags is done entirely in userspace on sparc64.
Userspace is free to change the tags on an address range at any time without
involving the kernel, and constantly rotating tags is in fact a security
feature. This makes it impractical for the kernel to try to keep up
with constantly changing tagged addresses in VMAs. Untagged addresses in
VMAs mean that any call to find_vma() and its relatives needs to be passed an
untagged address.
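
The lookup problem described above can be sketched in plain userspace C. This is only a toy model, not kernel code: the tag position (bits 63:56), the names toy_vma, toy_untag_addr and toy_find_vma, and the simplified "containing range" lookup are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the tag occupies bits 63:56 of a user address. */
#define TOY_TAG_SHIFT	56
#define TOY_TAG_MASK	(UINT64_C(0xff) << TOY_TAG_SHIFT)

/* Hypothetical stand-in for a VMA: an untagged [start, end) range. */
struct toy_vma {
	uint64_t start;
	uint64_t end;
};

/* Clear the tag bits, leaving the plain address. */
static inline uint64_t toy_untag_addr(uint64_t addr)
{
	return addr & ~TOY_TAG_MASK;
}

/* Simplified lookup: return the range containing addr, or NULL. */
static const struct toy_vma *toy_find_vma(const struct toy_vma *vmas,
					  size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= vmas[i].start && addr < vmas[i].end)
			return &vmas[i];
	return NULL;
}

static void toy_lookup_example(void)
{
	static const struct toy_vma vmas[] = {
		{ 0x10000, 0x20000 },
		{ 0x40000, 0x50000 },
	};
	uint64_t tagged = (UINT64_C(0x2a) << TOY_TAG_SHIFT) | 0x41000;

	/* The ranges are stored untagged, so a tagged address misses. */
	assert(toy_find_vma(vmas, 2, tagged) == NULL);
	/* Stripping the tag first finds the second range. */
	assert(toy_find_vma(vmas, 2, toy_untag_addr(tagged)) == &vmas[1]);
}
```

Because userspace may change the tag at any time, the sketch keeps the stored ranges untagged and strips the tag from the query instead.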

On sparc64, my intent was to support address tagging only for dynamically
allocated data buffers (malloc, mmap and shm specifically) and not
for generic system calls, which limited the scope and amount of
untagging needed in the kernel. ARM64 is working to add transparent
tagged address support at the C library level. Adding tagged addresses to the C
library requires every possible call into the kernel to either handle tagged
addresses or untag the address at some point. Andrey found out it is not as
easy as untagging addresses in the functions that search through VMAs.
Callers of find_vma() and similar functions tend to do address arithmetic on the
addresses stored in the returned VMA. This requires a more complex
solution than just stripping tags in the VMA lookup routines.
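
A small userspace sketch of why stripping tags only inside the lookup is not enough. It assumes a tag in bits 63:56; toy_vma_contains and toy_bytes_to_vma_end are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the tag occupies bits 63:56 of a user address. */
#define TOY_TAG_SHIFT	56
#define TOY_TAG_MASK	(UINT64_C(0xff) << TOY_TAG_SHIFT)

struct toy_vma {
	uint64_t start;
	uint64_t end;
};

static inline uint64_t toy_untag_addr(uint64_t addr)
{
	return addr & ~TOY_TAG_MASK;
}

/* Lookup that strips the tag internally; the caller keeps its tagged copy. */
static int toy_vma_contains(const struct toy_vma *vma, uint64_t addr)
{
	uint64_t a = toy_untag_addr(addr);

	return a >= vma->start && a < vma->end;
}

/* Typical caller-side arithmetic: bytes from addr to the end of the range. */
static uint64_t toy_bytes_to_vma_end(const struct toy_vma *vma, uint64_t addr)
{
	return vma->end - addr;
}

static void toy_arithmetic_example(void)
{
	const struct toy_vma vma = { 0x40000, 0x50000 };
	uint64_t tagged = (UINT64_C(0x2a) << TOY_TAG_SHIFT) | 0x41000;

	/* The lookup succeeds because it strips the tag itself... */
	assert(toy_vma_contains(&vma, tagged));
	/* ...but arithmetic on the still-tagged address wraps around. */
	assert(toy_bytes_to_vma_end(&vma, tagged) != 0xf000);
	/* Untagging once, before any use, gives the expected length. */
	assert(toy_bytes_to_vma_end(&vma, toy_untag_addr(tagged)) == 0xf000);
}
```

In this model the only safe pattern is to untag once at the entry point and use the untagged value everywhere afterwards.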

Since untagging addresses is a generic need required for far more than
gup, I prefer the way Andrey wrote it -
<https://patchwork.kernel.org/patch/10923637/>

--
Khalid

Christoph Hellwig May 29, 2019, 7:26 a.m. UTC | #3
On Tue, May 28, 2019 at 09:57:25AM -0600, Khalid Aziz wrote:
> Since untagging addresses is a generic need required for far more than
> gup, I prefer the way Andrey wrote it -
> <https://patchwork.kernel.org/patch/10923637/>

Linus, what do you think of picking up that trivial prep patch for
5.2?  That way the arm64 and get_user_pages series can progress
independently for 5.3.

Catalin Marinas May 29, 2019, 8:19 a.m. UTC | #4
Hi Christoph,

On Sat, 25 May 2019 at 14:33, Christoph Hellwig <hch@lst.de> wrote:
> diff --git a/mm/gup.c b/mm/gup.c
> index f173fcbaf1b2..1c21ecfbf38b 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2117,6 +2117,10 @@ static void gup_pgd_range(unsigned long addr, unsigned long end,
>         } while (pgdp++, addr = next, addr != end);
>  }
>
> +#ifndef gup_fixup_start_addr
> +#define gup_fixup_start_addr(start)    (start)
> +#endif

As you pointed out in a subsequent reply, we could use the
untagged_addr() macro from Andrey (or a shorter "untag_addr" if you
want it to look like a verb).

>  #ifndef gup_fast_permitted
>  /*
>   * Check if it's allowed to use __get_user_pages_fast() for the range, or
> @@ -2145,7 +2149,7 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
>         unsigned long flags;
>         int nr = 0;
>
> -       start &= PAGE_MASK;
> +       start = gup_fixup_start_addr(start) & PAGE_MASK;
>         len = (unsigned long) nr_pages << PAGE_SHIFT;
>         end = start + len;
>
> @@ -2218,7 +2222,7 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
>         unsigned long addr, len, end;
>         int nr = 0, ret = 0;
>
> -       start &= PAGE_MASK;
> +       start = gup_fixup_start_addr(start) & PAGE_MASK;
>         addr = start;
>         len = (unsigned long) nr_pages << PAGE_SHIFT;
>         end = start + len;

In Andrey's patch [1] we don't fix __get_user_pages_fast(), only
__get_user_pages() as it needs to do a find_vma() search. I wonder
whether this is actually necessary for the *_fast() versions. If the
top byte is non-zero (i.e. tagged address), 'end' would also have the
same tag. The page table macros like pgd_index() and pgd_addr_end()
already take care of masking out the top bits (at least for arm64)
since they need to work on kernel addresses, whose top bits are all 1. So
gup_pgd_range() should cope with tagged addresses already.

[1] https://lore.kernel.org/lkml/d234cd71774f35229bdfc0a793c34d6712b73093.1557160186.git.andreyknvl@google.com/
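
The index arithmetic in the argument above can be modelled in userspace C. The constants assume a 48-bit address space with a 39-bit top-level shift and 512 top-level entries, chosen only for illustration; the toy_ helpers imitate the shape of pgd_index() and pgd_addr_end() and are not the kernel macros. The sketch covers only the walk arithmetic, not any permission or range checks done before the walk.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions: 39-bit top-level shift, 512 entries, tag in bits 63:56. */
#define TOY_PGDIR_SHIFT		39
#define TOY_PTRS_PER_PGD	512
#define TOY_PGDIR_SIZE		(UINT64_C(1) << TOY_PGDIR_SHIFT)
#define TOY_PGDIR_MASK		(~(TOY_PGDIR_SIZE - 1))
#define TOY_TAG(t)		((uint64_t)(t) << 56)

/* Top-level table index: bits above the index field are masked away. */
static uint64_t toy_pgd_index(uint64_t addr)
{
	return (addr >> TOY_PGDIR_SHIFT) & (TOY_PTRS_PER_PGD - 1);
}

/* Next top-level boundary after addr, clamped to end. */
static uint64_t toy_pgd_addr_end(uint64_t addr, uint64_t end)
{
	uint64_t boundary = (addr + TOY_PGDIR_SIZE) & TOY_PGDIR_MASK;

	return (boundary - 1 < end - 1) ? boundary : end;
}

/* Count how many top-level entries a walk over [addr, end) visits. */
static unsigned int toy_count_pgd_steps(uint64_t addr, uint64_t end)
{
	unsigned int steps = 0;
	uint64_t next;

	do {
		next = toy_pgd_addr_end(addr, end);
		steps++;
	} while (addr = next, addr != end);

	return steps;
}

static void toy_walk_example(void)
{
	uint64_t start = UINT64_C(0x7ffffff000);	/* one page below a boundary */
	uint64_t end = start + 0x2000;			/* two pages long */
	uint64_t tag = TOY_TAG(0x2a);

	/* The tag does not change which entry is selected... */
	assert(toy_pgd_index(start) == toy_pgd_index(tag | start));
	/* ...and start and end carry the same tag, so the walk is unchanged. */
	assert(toy_count_pgd_steps(start, end) ==
	       toy_count_pgd_steps(tag | start, tag | end));
}
```

Note that in this model only the index helper masks the tag; the boundary helper keeps it, which works because 'end' was computed from the tagged 'start'.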

Patch

diff --git a/mm/gup.c b/mm/gup.c
index f173fcbaf1b2..1c21ecfbf38b 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2117,6 +2117,10 @@ static void gup_pgd_range(unsigned long addr, unsigned long end,
 	} while (pgdp++, addr = next, addr != end);
 }
 
+#ifndef gup_fixup_start_addr
+#define gup_fixup_start_addr(start)	(start)
+#endif
+
 #ifndef gup_fast_permitted
 /*
  * Check if it's allowed to use __get_user_pages_fast() for the range, or
@@ -2145,7 +2149,7 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
 	unsigned long flags;
 	int nr = 0;
 
-	start &= PAGE_MASK;
+	start = gup_fixup_start_addr(start) & PAGE_MASK;
 	len = (unsigned long) nr_pages << PAGE_SHIFT;
 	end = start + len;
 
@@ -2218,7 +2222,7 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
 	unsigned long addr, len, end;
 	int nr = 0, ret = 0;
 
-	start &= PAGE_MASK;
+	start = gup_fixup_start_addr(start) & PAGE_MASK;
 	addr = start;
 	len = (unsigned long) nr_pages << PAGE_SHIFT;
 	end = start + len;
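
For illustration, the override mechanism in the patch can be exercised in userspace roughly as follows. The macro name gup_fixup_start_addr and the fallback definition come from the patch; the tag layout (bits 63:56), the 4 KiB page mask, and the helpers toy_arch_strip_tag and toy_gup_start are assumptions, and this is not how sparc64 actually strips its ADI tags.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions: 4 KiB pages and a tag in bits 63:56, both illustrative. */
#define TOY_PAGE_MASK	(~UINT64_C(0xfff))
#define TOY_TAG_MASK	(UINT64_C(0xff) << 56)

/* What a hypothetical architecture header might provide. */
static inline uint64_t toy_arch_strip_tag(uint64_t addr)
{
	return addr & ~TOY_TAG_MASK;
}
#define gup_fixup_start_addr(start)	toy_arch_strip_tag(start)

/* Generic fallback as in the patch: used only if nothing was defined above. */
#ifndef gup_fixup_start_addr
#define gup_fixup_start_addr(start)	(start)
#endif

/* Mirrors the changed line: fix up the address, then round down to a page. */
static uint64_t toy_gup_start(uint64_t start)
{
	return gup_fixup_start_addr(start) & TOY_PAGE_MASK;
}

static void toy_hook_example(void)
{
	/* A tagged address is stripped and page-aligned. */
	assert(toy_gup_start(UINT64_C(0x2a00000000041234)) == UINT64_C(0x41000));
	/* An untagged address is only page-aligned. */
	assert(toy_gup_start(UINT64_C(0x41234)) == UINT64_C(0x41000));
}
```

Removing the architecture-side #define leaves the identity fallback in place, which is the behaviour every other architecture would get from the patch.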