diff mbox

[v2,for-4.10] x86/mm: Make PV linear pagetables optional

Message ID 20171018105159.9582-1-george.dunlap@citrix.com (mailing list archive)
State New, archived
Headers show

Commit Message

George Dunlap Oct. 18, 2017, 10:51 a.m. UTC
Allowing pagetables to point to other pagetables of the same level
(often called 'linear pagetables') has been included in Xen since its
inception; but recently it has been the source of a number of subtle
reference-counting bugs.

It is not used by Linux or MiniOS; but it used used by NetBSD and
Novell Netware.  There are significant numbers of people who are never
going to use the feature, along with significant numbers who need the
feature.

Add a Kconfig option for the feature (default to 'y').  Also add a
command-line option to control whether PV linear pagetables are
allowed (default to 'true').

NB that we leave linear_pt_count in the page struct.  It's in a union,
so its presence doesn't increase the size of the data struct.
Changing the layout of the other elements based on configuration
options is asking for trouble however; so we'll just leave it there
and ASSERT that it's zero.

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
Changes since v1
- Remove stray blank lines added from previous patch
- Leave pg->linear_pt_count present, assert it's 0
- Rename variable to opt_pv_linear_pt
- Add spaces around #ifdef/#else/#endif for large code block
- Add /* CONFIG_LINEAR_PV_PT */ after #else/#endif for clarity
- Correct documented default value
- Mention in documentation that the option is only available if configured
- Move config option to below PV (for the day when we make that selectable)

Changes since XSA
- Add a Kconfig option
- Default to 'on' (rather than 'off').

Release justification: This was originally part of a security fix
embargoed until after the freeze date; it wasn't checked in with the
other security patches in order to allow a discussion about the
default.

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Konrad Wilk <konrad.wilk@oracle.com>
CC: Julien Grall <julien.grall@arm.com>
---
 docs/misc/xen-command-line.markdown | 19 +++++++++++++++++++
 xen/arch/x86/Kconfig                | 20 ++++++++++++++++++++
 xen/arch/x86/mm.c                   | 37 +++++++++++++++++++++++++++++++++++++
 3 files changed, 76 insertions(+)

Comments

Wei Liu Oct. 18, 2017, 11:33 a.m. UTC | #1
On Wed, Oct 18, 2017 at 11:51:59AM +0100, George Dunlap wrote:
> Allowing pagetables to point to other pagetables of the same level
> (often called 'linear pagetables') has been included in Xen since its
> inception; but recently it has been the source of a number of subtle
> reference-counting bugs.
> 
> It is not used by Linux or MiniOS; but it used used by NetBSD and

"it used used" here.
Jan Beulich Oct. 18, 2017, 1:41 p.m. UTC | #2
>>> On 18.10.17 at 12:51, <george.dunlap@citrix.com> wrote:
> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -37,6 +37,26 @@ source "arch/Kconfig"
>  config PV
>  	def_bool y
>  
> +config PV_LINEAR_PT
> +       bool "Support for PV linear pagetables"
> +       depends on PV
> +       default y
> +       ---help---
> +         Linear pagetables (also called "recursive pagetables") refers
> +         to the practice of a guest operating system having pagetable
> +         entries pointing to other pagetables of the same level (i.e.,
> +         allowing L2 PTEs to point to other L2 pages).  Some operating
> +         systems use it as a simple way to consisently map the current
> +         process's pagetables into its own virtual address space.
> +
> +         Linux and MiniOS don't use this technique.  NetBSD and Novell
> +         Netware do; there may be other custom operating systems which
> +         do.  If you're certain you don't plan on having PV guests
> +         which use this feature, turning it off can reduce the attack
> +         surface.
> +
> +         If unsure, say Y.
> +
>  config HVM
>  	def_bool y

Note how the options in context use tab indentation. Granted
there are other examples of space indentation in this file, but
at least they're using 8 spaces (except of course of the help
text), while you're using 7.

> @@ -2320,6 +2353,7 @@ static int _put_page_type(struct page_info *page, bool preemptible,
>                  break;
>              }
>  
> +#ifdef CONFIG_PV_LINEAR_PT
>              if ( ptpg && PGT_type_equal(x, ptpg->u.inuse.type_info) )
>              {
>                  /*
> @@ -2334,6 +2368,9 @@ static int _put_page_type(struct page_info *page, bool preemptible,
>                  ASSERT(ptpg->linear_pt_count > 0);
>                  ptpg = NULL;
>              }
> +#else /* CONFIG_PV_LINEAR_PT */
> +            BUG_ON(ptpg && PGT_type_equal(x, ptpg->u.inuse.type_info));
> +#endif

Along the lines of my most recent reply to v1 (which I realize I
did send only after v2 had arrived), I'm not really certain about
the usefulness of the preprocessor conditionals - I'd prefer if
we went without them, but I can live with them if you strongly
think they're better than the alternative. If you keep them,
please convert the BUG_ON() to ASSERT() though, to be in
line with the #ifdef side.

Jan
George Dunlap Oct. 18, 2017, 1:49 p.m. UTC | #3
On 10/18/2017 02:41 PM, Jan Beulich wrote:
>>>> On 18.10.17 at 12:51, <george.dunlap@citrix.com> wrote:
>> --- a/xen/arch/x86/Kconfig
>> +++ b/xen/arch/x86/Kconfig
>> @@ -37,6 +37,26 @@ source "arch/Kconfig"
>>  config PV
>>  	def_bool y
>>  
>> +config PV_LINEAR_PT
>> +       bool "Support for PV linear pagetables"
>> +       depends on PV
>> +       default y
>> +       ---help---
>> +         Linear pagetables (also called "recursive pagetables") refers
>> +         to the practice of a guest operating system having pagetable
>> +         entries pointing to other pagetables of the same level (i.e.,
>> +         allowing L2 PTEs to point to other L2 pages).  Some operating
>> +         systems use it as a simple way to consisently map the current
>> +         process's pagetables into its own virtual address space.
>> +
>> +         Linux and MiniOS don't use this technique.  NetBSD and Novell
>> +         Netware do; there may be other custom operating systems which
>> +         do.  If you're certain you don't plan on having PV guests
>> +         which use this feature, turning it off can reduce the attack
>> +         surface.
>> +
>> +         If unsure, say Y.
>> +
>>  config HVM
>>  	def_bool y
> 
> Note how the options in context use tab indentation. Granted
> there are other examples of space indentation in this file, but
> at least they're using 8 spaces (except of course of the help
> text), while you're using 7.
> 
>> @@ -2320,6 +2353,7 @@ static int _put_page_type(struct page_info *page, bool preemptible,
>>                  break;
>>              }
>>  
>> +#ifdef CONFIG_PV_LINEAR_PT
>>              if ( ptpg && PGT_type_equal(x, ptpg->u.inuse.type_info) )
>>              {
>>                  /*
>> @@ -2334,6 +2368,9 @@ static int _put_page_type(struct page_info *page, bool preemptible,
>>                  ASSERT(ptpg->linear_pt_count > 0);
>>                  ptpg = NULL;
>>              }
>> +#else /* CONFIG_PV_LINEAR_PT */
>> +            BUG_ON(ptpg && PGT_type_equal(x, ptpg->u.inuse.type_info));
>> +#endif
> 
> Along the lines of my most recent reply to v1 (which I realize I
> did send only after v2 had arrived), I'm not really certain about
> the usefulness of the preprocessor conditionals - I'd prefer if
> we went without them, but I can live with them if you strongly
> think they're better than the alternative. If you keep them,
> please convert the BUG_ON() to ASSERT() though, to be in
> line with the #ifdef side.

I would argue that if linear pagetables are disabled, and we nonetheless
detect a linear pagetable, then BUG_ON() is the right behavior.  Since
we're not properly tracking any of it, it is almost certainly the result
of a security vulnerability.  Having a DoS in that case is much
preferrable to having a privilege escalation.

 -George
Jan Beulich Oct. 18, 2017, 1:59 p.m. UTC | #4
>>> On 18.10.17 at 15:49, <george.dunlap@citrix.com> wrote:
> On 10/18/2017 02:41 PM, Jan Beulich wrote:
>>>>> On 18.10.17 at 12:51, <george.dunlap@citrix.com> wrote:
>>> @@ -2334,6 +2368,9 @@ static int _put_page_type(struct page_info *page, bool preemptible,
>>>                  ASSERT(ptpg->linear_pt_count > 0);
>>>                  ptpg = NULL;
>>>              }
>>> +#else /* CONFIG_PV_LINEAR_PT */
>>> +            BUG_ON(ptpg && PGT_type_equal(x, ptpg->u.inuse.type_info));
>>> +#endif
>> 
>> Along the lines of my most recent reply to v1 (which I realize I
>> did send only after v2 had arrived), I'm not really certain about
>> the usefulness of the preprocessor conditionals - I'd prefer if
>> we went without them, but I can live with them if you strongly
>> think they're better than the alternative. If you keep them,
>> please convert the BUG_ON() to ASSERT() though, to be in
>> line with the #ifdef side.
> 
> I would argue that if linear pagetables are disabled, and we nonetheless
> detect a linear pagetable, then BUG_ON() is the right behavior.  Since
> we're not properly tracking any of it, it is almost certainly the result
> of a security vulnerability.  Having a DoS in that case is much
> preferrable to having a privilege escalation.

Okay, I can accept that argument. Which means, with the formatting
issue in Kconfig taken care of,
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan
Julien Grall Oct. 19, 2017, 11:16 a.m. UTC | #5
Hi George,

On 18/10/17 11:51, George Dunlap wrote:
> Allowing pagetables to point to other pagetables of the same level
> (often called 'linear pagetables') has been included in Xen since its
> inception; but recently it has been the source of a number of subtle
> reference-counting bugs.
> 
> It is not used by Linux or MiniOS; but it used used by NetBSD and
> Novell Netware.  There are significant numbers of people who are never
> going to use the feature, along with significant numbers who need the
> feature.
> 
> Add a Kconfig option for the feature (default to 'y').  Also add a
> command-line option to control whether PV linear pagetables are
> allowed (default to 'true').
> 
> NB that we leave linear_pt_count in the page struct.  It's in a union,
> so its presence doesn't increase the size of the data struct.
> Changing the layout of the other elements based on configuration
> options is asking for trouble however; so we'll just leave it there
> and ASSERT that it's zero.
> 
> Reported-by: Jann Horn <jannh@google.com>
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
> ---
> Changes since v1
> - Remove stray blank lines added from previous patch
> - Leave pg->linear_pt_count present, assert it's 0
> - Rename variable to opt_pv_linear_pt
> - Add spaces around #ifdef/#else/#endif for large code block
> - Add /* CONFIG_LINEAR_PV_PT */ after #else/#endif for clarity
> - Correct documented default value
> - Mention in documentation that the option is only available if configured
> - Move config option to below PV (for the day when we make that selectable)
> 
> Changes since XSA
> - Add a Kconfig option
> - Default to 'on' (rather than 'off').
> 
> Release justification: This was originally part of a security fix
> embargoed until after the freeze date; it wasn't checked in with the
> other security patches in order to allow a discussion about the
> default.

On the basis it was part of the XSA:

Release-acked-by: Julien Grall <julien.grall@linaro.org>

> diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
> index 64955dc017..a8bbaa652b 100644
> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -37,6 +37,26 @@ source "arch/Kconfig"
>   config PV
>   	def_bool y
>   
> +config PV_LINEAR_PT
> +       bool "Support for PV linear pagetables"
> +       depends on PV
> +       default y
> +       ---help---
> +         Linear pagetables (also called "recursive pagetables") refers
> +         to the practice of a guest operating system having pagetable
> +         entries pointing to other pagetables of the same level (i.e.,
> +         allowing L2 PTEs to point to other L2 pages).  Some operating
> +         systems use it as a simple way to consisently map the current

s/consisently/consistently/

> +         process's pagetables into its own virtual address space.
> +
> +         Linux and MiniOS don't use this technique.  NetBSD and Novell
> +         Netware do; there may be other custom operating systems which
> +         do.  If you're certain you don't plan on having PV guests
> +         which use this feature, turning it off can reduce the attack
> +         surface.
> +
> +         If unsure, say Y.
> +
>   config HVM
>   	def_bool y
>   

Cheers,
diff mbox

Patch

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index eb4995e68b..781110d4b2 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1422,6 +1422,25 @@  The following resources are available:
     CDP, one COS will corespond two CBMs other than one with CAT, due to the
     sum of CBMs is fixed, that means actual `cos_max` in use will automatically
     reduce to half when CDP is enabled.
+	
+### pv-linear-pt
+> `= <boolean>`
+
+> Default: `true`
+
+Only available if Xen is compiled with CONFIG\_PV\_LINEAR\_PT support
+enabled.
+
+Allow PV guests to have pagetable entries pointing to other pagetables
+of the same level (i.e., allowing L2 PTEs to point to other L2 pages).
+This technique is often called "linear pagetables", and is sometimes
+used to allow operating systems a simple way to consistently map the
+current process's pagetables into its own virtual address space.
+
+Linux and MiniOS don't use this technique.  NetBSD and Novell Netware
+do; there may be other custom operating systems which do.  If you're
+certain you don't plan on having PV guests which use this feature,
+turning it off can reduce the attack surface.
 
 ### rcu-idle-timer-period-ms
 > `= <integer>`
diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 64955dc017..a8bbaa652b 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -37,6 +37,26 @@  source "arch/Kconfig"
 config PV
 	def_bool y
 
+config PV_LINEAR_PT
+       bool "Support for PV linear pagetables"
+       depends on PV
+       default y
+       ---help---
+         Linear pagetables (also called "recursive pagetables") refers
+         to the practice of a guest operating system having pagetable
+         entries pointing to other pagetables of the same level (i.e.,
+         allowing L2 PTEs to point to other L2 pages).  Some operating
+         systems use it as a simple way to consisently map the current
+         process's pagetables into its own virtual address space.
+
+         Linux and MiniOS don't use this technique.  NetBSD and Novell
+         Netware do; there may be other custom operating systems which
+         do.  If you're certain you don't plan on having PV guests
+         which use this feature, turning it off can reduce the attack
+         surface.
+
+         If unsure, say Y.
+
 config HVM
 	def_bool y
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 62d313e3f5..2f9febd1ee 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -587,6 +587,8 @@  static void put_data_page(
         put_page(page);
 }
 
+#ifdef CONFIG_PV_LINEAR_PT
+
 static bool inc_linear_entries(struct page_info *pg)
 {
     typeof(pg->linear_pt_count) nc = read_atomic(&pg->linear_pt_count), oc;
@@ -654,6 +656,9 @@  static void dec_linear_uses(struct page_info *pg)
  *     frame if it is mapped by a different root table. This is sufficient and
  *     also necessary to allow validation of a root table mapping itself.
  */
+static bool __read_mostly opt_pv_linear_pt = true;
+boolean_param("pv-linear-pt", opt_pv_linear_pt);
+
 #define define_get_linear_pagetable(level)                                  \
 static int                                                                  \
 get_##level##_linear_pagetable(                                             \
@@ -663,6 +668,13 @@  get_##level##_linear_pagetable(                                             \
     struct page_info *page;                                                 \
     unsigned long pfn;                                                      \
                                                                             \
+    if ( !opt_pv_linear_pt )                                                \
+    {                                                                       \
+        gdprintk(XENLOG_WARNING,                                            \
+                 "Attempt to create linear p.t. (feature disabled)\n");     \
+        return 0;                                                           \
+    }                                                                       \
+                                                                            \
     if ( (level##e_get_flags(pde) & _PAGE_RW) )                             \
     {                                                                       \
         gdprintk(XENLOG_WARNING,                                            \
@@ -720,6 +732,27 @@  get_##level##_linear_pagetable(                                             \
     return 1;                                                               \
 }
 
+#else /* CONFIG_PV_LINEAR_PT */
+
+#define define_get_linear_pagetable(level)                              \
+static int                                                              \
+get_##level##_linear_pagetable(                                         \
+        level##_pgentry_t pde, unsigned long pde_pfn, struct domain *d) \
+{                                                                       \
+        return 0;                                                       \
+}
+
+static void dec_linear_uses(struct page_info *pg)
+{
+    ASSERT(pg->linear_pt_count == 0);
+}
+
+static void dec_linear_entries(struct page_info *pg)
+{
+    ASSERT(pg->linear_pt_count == 0);
+}
+
+#endif /* CONFIG_PV_LINEAR_PT */
 
 bool is_iomem_page(mfn_t mfn)
 {
@@ -2320,6 +2353,7 @@  static int _put_page_type(struct page_info *page, bool preemptible,
                 break;
             }
 
+#ifdef CONFIG_PV_LINEAR_PT
             if ( ptpg && PGT_type_equal(x, ptpg->u.inuse.type_info) )
             {
                 /*
@@ -2334,6 +2368,9 @@  static int _put_page_type(struct page_info *page, bool preemptible,
                 ASSERT(ptpg->linear_pt_count > 0);
                 ptpg = NULL;
             }
+#else /* CONFIG_PV_LINEAR_PT */
+            BUG_ON(ptpg && PGT_type_equal(x, ptpg->u.inuse.type_info));
+#endif
 
             set_tlbflush_timestamp(page);
         }