diff mbox series

[RESEND,1/1] vfio/nvlink: Remove exec permission to avoid SELinux AVCs

Message ID 8f5ae1658103b71b09555d3ab499edaf3f36a15d.camel@gmail.com (mailing list archive)
State New, archived
Headers show
Series [RESEND,1/1] vfio/nvlink: Remove exec permission to avoid SELinux AVCs | expand

Commit Message

Leonardo Bras May 18, 2020, 3:05 p.m. UTC
If SELinux is setup without 'execmem' permission for qemu, all mmap
with (PROT_WRITE | PROT_EXEC) will fail and print a warning in
SELinux log.

If "nvlink2-mr" memory allocation fails (fist diff), it will cause
guest NUMA nodes to not be correctly configured (V100 memory will
not be visible for guest, nor its NUMA nodes).

Not having 'execmem' permission is intesting for virtual machines to
avoid buffer-overflow based attacks, and it's adopted in distros
like RHEL.

So, removing the PROT_EXEC flag seems the right thing to do.

Browsing some other code that mmaps memory for usage with
memory_region_init_ram_device_ptr, I could notice it's usual to
not have PROT_EXEC (only PROT_READ | PROT_WRITE), so it should be
no problem around this.

Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>

---
- Alexey's review is here: https://lists.nongnu.org/archive/html/qemu-devel/2020-05/msg00006.html

 hw/vfio/pci-quirks.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Alex Williamson May 26, 2020, 8:43 p.m. UTC | #1
On Mon, 18 May 2020 12:05:24 -0300
Leonardo Bras <leobras.c@gmail.com> wrote:

> If SELinux is setup without 'execmem' permission for qemu, all mmap
> with (PROT_WRITE | PROT_EXEC) will fail and print a warning in
> SELinux log.
> 
> If "nvlink2-mr" memory allocation fails (fist diff), it will cause
> guest NUMA nodes to not be correctly configured (V100 memory will
> not be visible for guest, nor its NUMA nodes).
> 
> Not having 'execmem' permission is intesting for virtual machines to
> avoid buffer-overflow based attacks, and it's adopted in distros
> like RHEL.
> 
> So, removing the PROT_EXEC flag seems the right thing to do.
> 
> Browsing some other code that mmaps memory for usage with
> memory_region_init_ram_device_ptr, I could notice it's usual to
> not have PROT_EXEC (only PROT_READ | PROT_WRITE), so it should be
> no problem around this.
> 
> Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> 
> ---

Seems David Gibson might be in a position to send a pull request
including this before I can, so:

Acked-by: Alex Williamson <alex.williamson@redhat.com>


> - Alexey's review is here: https://lists.nongnu.org/archive/html/qemu-devel/2020-05/msg00006.html
> 
>  hw/vfio/pci-quirks.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
> index 2d348f8237..124d4f57e1 100644
> --- a/hw/vfio/pci-quirks.c
> +++ b/hw/vfio/pci-quirks.c
> @@ -1620,7 +1620,7 @@ int vfio_pci_nvidia_v100_ram_init(VFIOPCIDevice *vdev, Error **errp)
>      }
>      cap = (void *) hdr;
>  
> -    p = mmap(NULL, nv2reg->size, PROT_READ | PROT_WRITE | PROT_EXEC,
> +    p = mmap(NULL, nv2reg->size, PROT_READ | PROT_WRITE,
>               MAP_SHARED, vdev->vbasedev.fd, nv2reg->offset);
>      if (p == MAP_FAILED) {
>          ret = -errno;
> @@ -1680,7 +1680,7 @@ int vfio_pci_nvlink2_init(VFIOPCIDevice *vdev, Error **errp)
>  
>      /* Some NVLink bridges may not have assigned ATSD */
>      if (atsdreg->size) {
> -        p = mmap(NULL, atsdreg->size, PROT_READ | PROT_WRITE | PROT_EXEC,
> +        p = mmap(NULL, atsdreg->size, PROT_READ | PROT_WRITE,
>                   MAP_SHARED, vdev->vbasedev.fd, atsdreg->offset);
>          if (p == MAP_FAILED) {
>              ret = -errno;
>
David Gibson May 27, 2020, 12:42 a.m. UTC | #2
On Tue, May 26, 2020 at 02:43:43PM -0600, Alex Williamson wrote:
> On Mon, 18 May 2020 12:05:24 -0300
> Leonardo Bras <leobras.c@gmail.com> wrote:
> 
> > If SELinux is setup without 'execmem' permission for qemu, all mmap
> > with (PROT_WRITE | PROT_EXEC) will fail and print a warning in
> > SELinux log.
> > 
> > If "nvlink2-mr" memory allocation fails (fist diff), it will cause
> > guest NUMA nodes to not be correctly configured (V100 memory will
> > not be visible for guest, nor its NUMA nodes).
> > 
> > Not having 'execmem' permission is intesting for virtual machines to
> > avoid buffer-overflow based attacks, and it's adopted in distros
> > like RHEL.
> > 
> > So, removing the PROT_EXEC flag seems the right thing to do.
> > 
> > Browsing some other code that mmaps memory for usage with
> > memory_region_init_ram_device_ptr, I could notice it's usual to
> > not have PROT_EXEC (only PROT_READ | PROT_WRITE), so it should be
> > no problem around this.
> > 
> > Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
> > Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > 
> > ---
> 
> Seems David Gibson might be in a position to send a pull request
> including this before I can, so:

Merged to ppc-for-5.1, thanks.

> 
> Acked-by: Alex Williamson <alex.williamson@redhat.com>
> 
> 
> > - Alexey's review is here: https://lists.nongnu.org/archive/html/qemu-devel/2020-05/msg00006.html
> > 
> >  hw/vfio/pci-quirks.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
> > index 2d348f8237..124d4f57e1 100644
> > --- a/hw/vfio/pci-quirks.c
> > +++ b/hw/vfio/pci-quirks.c
> > @@ -1620,7 +1620,7 @@ int vfio_pci_nvidia_v100_ram_init(VFIOPCIDevice *vdev, Error **errp)
> >      }
> >      cap = (void *) hdr;
> >  
> > -    p = mmap(NULL, nv2reg->size, PROT_READ | PROT_WRITE | PROT_EXEC,
> > +    p = mmap(NULL, nv2reg->size, PROT_READ | PROT_WRITE,
> >               MAP_SHARED, vdev->vbasedev.fd, nv2reg->offset);
> >      if (p == MAP_FAILED) {
> >          ret = -errno;
> > @@ -1680,7 +1680,7 @@ int vfio_pci_nvlink2_init(VFIOPCIDevice *vdev, Error **errp)
> >  
> >      /* Some NVLink bridges may not have assigned ATSD */
> >      if (atsdreg->size) {
> > -        p = mmap(NULL, atsdreg->size, PROT_READ | PROT_WRITE | PROT_EXEC,
> > +        p = mmap(NULL, atsdreg->size, PROT_READ | PROT_WRITE,
> >                   MAP_SHARED, vdev->vbasedev.fd, atsdreg->offset);
> >          if (p == MAP_FAILED) {
> >              ret = -errno;
> > 
> 
>
diff mbox series

Patch

diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index 2d348f8237..124d4f57e1 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -1620,7 +1620,7 @@  int vfio_pci_nvidia_v100_ram_init(VFIOPCIDevice *vdev, Error **errp)
     }
     cap = (void *) hdr;
 
-    p = mmap(NULL, nv2reg->size, PROT_READ | PROT_WRITE | PROT_EXEC,
+    p = mmap(NULL, nv2reg->size, PROT_READ | PROT_WRITE,
              MAP_SHARED, vdev->vbasedev.fd, nv2reg->offset);
     if (p == MAP_FAILED) {
         ret = -errno;
@@ -1680,7 +1680,7 @@  int vfio_pci_nvlink2_init(VFIOPCIDevice *vdev, Error **errp)
 
     /* Some NVLink bridges may not have assigned ATSD */
     if (atsdreg->size) {
-        p = mmap(NULL, atsdreg->size, PROT_READ | PROT_WRITE | PROT_EXEC,
+        p = mmap(NULL, atsdreg->size, PROT_READ | PROT_WRITE,
                  MAP_SHARED, vdev->vbasedev.fd, atsdreg->offset);
         if (p == MAP_FAILED) {
             ret = -errno;