[RFC,v2,6/4] mm: disallow user copy to/from separately allocated pages
diff mbox

Message ID 20160610154403.1d906134@annuminas.surriel.com
State New
Headers show

Commit Message

Rik van Riel June 10, 2016, 7:44 p.m. UTC
v2 of yesterday's patch, this one seems to completely work on my
system, after taking _bdata & _edata into account.

I am still looking into CMA, but am leaning towards doing that as
a follow-up patch.

---8<---

Subject: mm: disallow user copy to/from separately allocated pages

A single copy_from_user or copy_to_user should go to or from a single
kernel object. Inside the slab, or on the stack, we can track the
individual objects.

For the general kernel heap, we do not know exactly where each object
is, but we can tell whether the whole range from ptr to ptr + n is
inside the same page, or inside the same compound page.

If the start and end of the "object" are in pages that were not allocated
together, we are likely dealing with an overflow from one object into
the next page, and should disallow this copy.

Signed-off-by: Rik van Riel <riel@redhat.com>
---
v2:
 - also test against _bdata & _edata, this appears to be necessary for
   some kernfs/sysfs stuff
 - clean up the code a little bit

 mm/usercopy.c | 27 +++++++++++++++++++++++----
 1 file changed, 23 insertions(+), 4 deletions(-)

Comments

Kees Cook June 10, 2016, 8:46 p.m. UTC | #1
On Fri, Jun 10, 2016 at 12:44 PM, Rik van Riel <riel@redhat.com> wrote:
> v2 of yesterday's patch, this one seems to completely work on my
> system, after taking _bdata & _edata into account.
>
> I am still looking into CMA, but am leaning towards doing that as
> a follow-up patch.

Cool. It looks like this is slowly changing to more of a whitelist
than a blacklist, which I think would be a welcome improvement if it's
feasible. That way we don't need an explicit check for kernel text, it
would already be outside the allowed areas.

-Kees

>
> ---8<---
>
> Subject: mm: disallow user copy to/from separately allocated pages
>
> A single copy_from_user or copy_to_user should go to or from a single
> kernel object. Inside the slab, or on the stack, we can track the
> individual objects.
>
> For the general kernel heap, we do not know exactly where each object
> is, but we can tell whether the whole range from ptr to ptr + n is
> inside the same page, or inside the same compound page.
>
> If the start and end of the "object" are in pages that were not allocated
> together, we are likely dealing with an overflow from one object into
> the next page, and should disallow this copy.
>
> Signed-off-by: Rik van Riel <riel@redhat.com>
> ---
> v2:
>  - also test against _bdata & _edata, this appears to be necessary for
>    some kernfs/sysfs stuff
>  - clean up the code a little bit
>
>  mm/usercopy.c | 27 +++++++++++++++++++++++----
>  1 file changed, 23 insertions(+), 4 deletions(-)
>
> diff --git a/mm/usercopy.c b/mm/usercopy.c
> index e09c33070759..78869ea73194 100644
> --- a/mm/usercopy.c
> +++ b/mm/usercopy.c
> @@ -109,7 +109,8 @@ static inline bool check_kernel_text_object(const void *ptr, unsigned long n)
>
>  static inline const char *check_heap_object(const void *ptr, unsigned long n)
>  {
> -       struct page *page;
> +       struct page *page, *endpage;
> +       const void *end = ptr + n - 1;
>
>         if (ZERO_OR_NULL_PTR(ptr))
>                 return "<null>";
> @@ -118,11 +119,29 @@ static inline const char *check_heap_object(const void *ptr, unsigned long n)
>                 return NULL;
>
>         page = virt_to_head_page(ptr);
> -       if (!PageSlab(page))
> +       if (PageSlab(page))
> +               /* Check allocator for flags and size. */
> +               return __check_heap_object(ptr, n, page);
> +
> +       /* Is the object wholly within one base page? */
> +       if (likely(((unsigned long)ptr & (unsigned long)PAGE_MASK) ==
> +                  ((unsigned long)end & (unsigned long)PAGE_MASK)))
> +               return NULL;
> +
> +       /* Are the start and end inside the same compound page? */
> +       endpage = virt_to_head_page(end);
> +       if (likely(endpage == page))
> +               return NULL;
> +
> +       /* Is this a special area, eg. .rodata, .bss, or device memory? */
> +       if (ptr >= (const void *)_sdata && end <= (const void *)_edata)
> +               return NULL;
> +
> +       if (PageReserved(page) && PageReserved(endpage))
>                 return NULL;
>
> -       /* Check allocator for flags and size. */
> -       return __check_heap_object(ptr, n, page);
> +       /* Uh oh. The "object" spans several independently allocated pages. */
> +       return "<spans multiple pages>";
>  }
>
>  /*
Kees Cook June 24, 2016, 8:53 p.m. UTC | #2
On Fri, Jun 10, 2016 at 12:44 PM, Rik van Riel <riel@redhat.com> wrote:
> v2 of yesterday's patch, this one seems to completely work on my
> system, after taking _bdata & _edata into account.
>
> I am still looking into CMA, but am leaning towards doing that as
> a follow-up patch.
>
> ---8<---
>
> Subject: mm: disallow user copy to/from separately allocated pages
>
> A single copy_from_user or copy_to_user should go to or from a single
> kernel object. Inside the slab, or on the stack, we can track the
> individual objects.
>
> For the general kernel heap, we do not know exactly where each object
> is, but we can tell whether the whole range from ptr to ptr + n is
> inside the same page, or inside the same compound page.
>
> If the start and end of the "object" are in pages that were not allocated
> together, we are likely dealing with an overflow from one object into
> the next page, and should disallow this copy.
>
> Signed-off-by: Rik van Riel <riel@redhat.com>
> ---
> v2:
>  - also test against _bdata & _edata, this appears to be necessary for
>    some kernfs/sysfs stuff
>  - clean up the code a little bit
>
>  mm/usercopy.c | 27 +++++++++++++++++++++++----
>  1 file changed, 23 insertions(+), 4 deletions(-)
>
> diff --git a/mm/usercopy.c b/mm/usercopy.c
> index e09c33070759..78869ea73194 100644
> --- a/mm/usercopy.c
> +++ b/mm/usercopy.c
> @@ -109,7 +109,8 @@ static inline bool check_kernel_text_object(const void *ptr, unsigned long n)
>
>  static inline const char *check_heap_object(const void *ptr, unsigned long n)
>  {
> -       struct page *page;
> +       struct page *page, *endpage;
> +       const void *end = ptr + n - 1;
>
>         if (ZERO_OR_NULL_PTR(ptr))
>                 return "<null>";
> @@ -118,11 +119,29 @@ static inline const char *check_heap_object(const void *ptr, unsigned long n)
>                 return NULL;
>
>         page = virt_to_head_page(ptr);
> -       if (!PageSlab(page))
> +       if (PageSlab(page))
> +               /* Check allocator for flags and size. */
> +               return __check_heap_object(ptr, n, page);
> +
> +       /* Is the object wholly within one base page? */
> +       if (likely(((unsigned long)ptr & (unsigned long)PAGE_MASK) ==
> +                  ((unsigned long)end & (unsigned long)PAGE_MASK)))
> +               return NULL;
> +
> +       /* Are the start and end inside the same compound page? */
> +       endpage = virt_to_head_page(end);
> +       if (likely(endpage == page))
> +               return NULL;
> +
> +       /* Is this a special area, eg. .rodata, .bss, or device memory? */
> +       if (ptr >= (const void *)_sdata && end <= (const void *)_edata)
> +               return NULL;
> +
> +       if (PageReserved(page) && PageReserved(endpage))
>                 return NULL;

Shouldn't PageReserved cover the .data, .rodata, and .bss areas
already? Is the concern for the check being added here that a copy
might span multiple pages and that they're not allocated together when
laying out the kernel data regions?

-Kees

>
> -       /* Check allocator for flags and size. */
> -       return __check_heap_object(ptr, n, page);
> +       /* Uh oh. The "object" spans several independently allocated pages. */
> +       return "<spans multiple pages>";
>  }
>
>  /*
Rik van Riel June 24, 2016, 8:57 p.m. UTC | #3
On Fri, 2016-06-24 at 13:53 -0700, Kees Cook wrote:
> On Fri, Jun 10, 2016 at 12:44 PM, Rik van Riel <riel@redhat.com>
> wrote:
> > 
> > v2 of yesterday's patch, this one seems to completely work on my
> > system, after taking _bdata & _edata into account.
> > 
> > I am still looking into CMA, but am leaning towards doing that as
> > a follow-up patch.
> > 
> > ---8<---
> > 
> > Subject: mm: disallow user copy to/from separately allocated pages
> > 
> > A single copy_from_user or copy_to_user should go to or from a
> > single
> > kernel object. Inside the slab, or on the stack, we can track the
> > individual objects.
> > 
> > For the general kernel heap, we do not know exactly where each
> > object
> > is, but we can tell whether the whole range from ptr to ptr + n is
> > inside the same page, or inside the same compound page.
> > 
> > If the start and end of the "object" are in pages that were not
> > allocated
> > together, we are likely dealing with an overflow from one object
> > into
> > the next page, and should disallow this copy.
> > 
> > Signed-off-by: Rik van Riel <riel@redhat.com>
> > ---
> > v2:
> >  - also test against _bdata & _edata, this appears to be necessary
> > for
> >    some kernfs/sysfs stuff
> >  - clean up the code a little bit
> > 
> >  mm/usercopy.c | 27 +++++++++++++++++++++++----
> >  1 file changed, 23 insertions(+), 4 deletions(-)
> > 
> > diff --git a/mm/usercopy.c b/mm/usercopy.c
> > index e09c33070759..78869ea73194 100644
> > --- a/mm/usercopy.c
> > +++ b/mm/usercopy.c
> > @@ -109,7 +109,8 @@ static inline bool
> > check_kernel_text_object(const void *ptr, unsigned long n)
> > 
> >  static inline const char *check_heap_object(const void *ptr,
> > unsigned long n)
> >  {
> > -       struct page *page;
> > +       struct page *page, *endpage;
> > +       const void *end = ptr + n - 1;
> > 
> >         if (ZERO_OR_NULL_PTR(ptr))
> >                 return "<null>";
> > @@ -118,11 +119,29 @@ static inline const char
> > *check_heap_object(const void *ptr, unsigned long n)
> >                 return NULL;
> > 
> >         page = virt_to_head_page(ptr);
> > -       if (!PageSlab(page))
> > +       if (PageSlab(page))
> > +               /* Check allocator for flags and size. */
> > +               return __check_heap_object(ptr, n, page);
> > +
> > +       /* Is the object wholly within one base page? */
> > +       if (likely(((unsigned long)ptr & (unsigned long)PAGE_MASK)
> > ==
> > +                  ((unsigned long)end & (unsigned
> > long)PAGE_MASK)))
> > +               return NULL;
> > +
> > +       /* Are the start and end inside the same compound page? */
> > +       endpage = virt_to_head_page(end);
> > +       if (likely(endpage == page))
> > +               return NULL;
> > +
> > +       /* Is this a special area, eg. .rodata, .bss, or device
> > memory? */
> > +       if (ptr >= (const void *)_sdata && end <= (const void
> > *)_edata)
> > +               return NULL;
> > +
> > +       if (PageReserved(page) && PageReserved(endpage))
> >                 return NULL;
> Shouldn't PageReserved cover the .data, .rodata, and .bss areas
> already? Is the concern for the check being added here that a copy
> might span multiple pages and that they're not allocated together
> when
> laying out the kernel data regions?

Having just the PageReserved check was not enough, I had
to specifically add the _sdata & _edata test to get things
to work.

It looks like PageReserved is simply not set in some cases,
perhaps we are mapping those kernel sections with huge pages,
but not setting up the compound page stuff for the backing
pages, since they never pass through the buddy allocator?
Kees Cook June 24, 2016, 8:59 p.m. UTC | #4
On Fri, Jun 24, 2016 at 1:57 PM, Rik van Riel <riel@redhat.com> wrote:
> On Fri, 2016-06-24 at 13:53 -0700, Kees Cook wrote:
>> On Fri, Jun 10, 2016 at 12:44 PM, Rik van Riel <riel@redhat.com>
>> wrote:
>> >
>> > v2 of yesterday's patch, this one seems to completely work on my
>> > system, after taking _bdata & _edata into account.
>> >
>> > I am still looking into CMA, but am leaning towards doing that as
>> > a follow-up patch.
>> >
>> > ---8<---
>> >
>> > Subject: mm: disallow user copy to/from separately allocated pages
>> >
>> > A single copy_from_user or copy_to_user should go to or from a
>> > single
>> > kernel object. Inside the slab, or on the stack, we can track the
>> > individual objects.
>> >
>> > For the general kernel heap, we do not know exactly where each
>> > object
>> > is, but we can tell whether the whole range from ptr to ptr + n is
>> > inside the same page, or inside the same compound page.
>> >
>> > If the start and end of the "object" are in pages that were not
>> > allocated
>> > together, we are likely dealing with an overflow from one object
>> > into
>> > the next page, and should disallow this copy.
>> >
>> > Signed-off-by: Rik van Riel <riel@redhat.com>
>> > ---
>> > v2:
>> >  - also test against _bdata & _edata, this appears to be necessary
>> > for
>> >    some kernfs/sysfs stuff
>> >  - clean up the code a little bit
>> >
>> >  mm/usercopy.c | 27 +++++++++++++++++++++++----
>> >  1 file changed, 23 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/mm/usercopy.c b/mm/usercopy.c
>> > index e09c33070759..78869ea73194 100644
>> > --- a/mm/usercopy.c
>> > +++ b/mm/usercopy.c
>> > @@ -109,7 +109,8 @@ static inline bool
>> > check_kernel_text_object(const void *ptr, unsigned long n)
>> >
>> >  static inline const char *check_heap_object(const void *ptr,
>> > unsigned long n)
>> >  {
>> > -       struct page *page;
>> > +       struct page *page, *endpage;
>> > +       const void *end = ptr + n - 1;
>> >
>> >         if (ZERO_OR_NULL_PTR(ptr))
>> >                 return "<null>";
>> > @@ -118,11 +119,29 @@ static inline const char
>> > *check_heap_object(const void *ptr, unsigned long n)
>> >                 return NULL;
>> >
>> >         page = virt_to_head_page(ptr);
>> > -       if (!PageSlab(page))
>> > +       if (PageSlab(page))
>> > +               /* Check allocator for flags and size. */
>> > +               return __check_heap_object(ptr, n, page);
>> > +
>> > +       /* Is the object wholly within one base page? */
>> > +       if (likely(((unsigned long)ptr & (unsigned long)PAGE_MASK)
>> > ==
>> > +                  ((unsigned long)end & (unsigned
>> > long)PAGE_MASK)))
>> > +               return NULL;
>> > +
>> > +       /* Are the start and end inside the same compound page? */
>> > +       endpage = virt_to_head_page(end);
>> > +       if (likely(endpage == page))
>> > +               return NULL;
>> > +
>> > +       /* Is this a special area, eg. .rodata, .bss, or device
>> > memory? */
>> > +       if (ptr >= (const void *)_sdata && end <= (const void
>> > *)_edata)
>> > +               return NULL;
>> > +
>> > +       if (PageReserved(page) && PageReserved(endpage))
>> >                 return NULL;
>> Shouldn't PageReserved cover the .data, .rodata, and .bss areas
>> already? Is the concern for the check being added here that a copy
>> might span multiple pages and that they're not allocated together
>> when
>> laying out the kernel data regions?
>
> Having just the PageReserved check was not enough, I had
> to specifically add the _sdata & _edata test to get things
> to work.
>
> It looks like PageReserved is simply not set in some cases,
> perhaps we are mapping those kernel sections with huge pages,
> but not setting up the compound page stuff for the backing
> pages, since they never pass through the buddy allocator?

Ah-ha, gotcha. Thanks!

-Kees

Patch
diff mbox

diff --git a/mm/usercopy.c b/mm/usercopy.c
index e09c33070759..78869ea73194 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -109,7 +109,8 @@  static inline bool check_kernel_text_object(const void *ptr, unsigned long n)
 
 static inline const char *check_heap_object(const void *ptr, unsigned long n)
 {
-	struct page *page;
+	struct page *page, *endpage;
+	const void *end = ptr + n - 1;
 
 	if (ZERO_OR_NULL_PTR(ptr))
 		return "<null>";
@@ -118,11 +119,29 @@  static inline const char *check_heap_object(const void *ptr, unsigned long n)
 		return NULL;
 
 	page = virt_to_head_page(ptr);
-	if (!PageSlab(page))
+	if (PageSlab(page))
+		/* Check allocator for flags and size. */
+		return __check_heap_object(ptr, n, page);
+
+	/* Is the object wholly within one base page? */
+	if (likely(((unsigned long)ptr & (unsigned long)PAGE_MASK) ==
+		   ((unsigned long)end & (unsigned long)PAGE_MASK)))
+		return NULL;
+
+	/* Are the start and end inside the same compound page? */
+	endpage = virt_to_head_page(end);
+	if (likely(endpage == page))
+		return NULL;
+
+	/* Is this a special area, eg. .rodata, .bss, or device memory? */
+	if (ptr >= (const void *)_sdata && end <= (const void *)_edata)
+		return NULL;
+
+	if (PageReserved(page) && PageReserved(endpage))
 		return NULL;
 
-	/* Check allocator for flags and size. */
-	return __check_heap_object(ptr, n, page);
+	/* Uh oh. The "object" spans several independently allocated pages. */
+	return "<spans multiple pages>";
 }
 
 /*