diff mbox series

[v1,kvmtool,7/7] vfio/pci: Align MSIX Table and PBA size allocation to 64k

Message ID 20210913154413.14322-8-alexandru.elisei@arm.com (mailing list archive)
State New, archived
Headers show
Series vfio/pci: Fix MSIX table and PBA size allocation | expand

Commit Message

Alexandru Elisei Sept. 13, 2021, 3:44 p.m. UTC
When allocating MMIO space for the MSI-X table, kvmtool rounds the
allocation to the host's page size to make it as easy as possible for the
guest to map the table to a page, if it wants to (and doesn't do BAR
reassignment, like the x86 architecture for example). However, the host's
page size can differ from the guest's, for example, if the host is compiled
with 4k pages and the guest is using 64k pages.

To make sure the allocation is always aligned to a guest's page size, round
it up to the maximum page size, which is 64k. Do the same for the pending
bit array if it lives in its own BAR.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 vfio/pci.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Andre Przywara Oct. 6, 2021, 3:11 p.m. UTC | #1
On Mon, 13 Sep 2021 16:44:13 +0100
Alexandru Elisei <alexandru.elisei@arm.com> wrote:

Hi,

> When allocating MMIO space for the MSI-X table, kvmtool rounds the
> allocation to the host's page size to make it as easy as possible for the
> guest to map the table to a page, if it wants to (and doesn't do BAR
> reassignment, like the x86 architecture for example). However, the host's
> page size can differ from the guest's, for example, if the host is compiled
> with 4k pages and the guest is using 64k pages.
> 
> To make sure the allocation is always aligned to a guest's page size, round
> it up to the maximum page size, which is 64k. Do the same for the pending
> bit array if it lives in its own BAR.

The idea of that looks alright on the first glance, but isn't needed for
x86, right? So should this be using an arch-specific MAX_PAGE_SIZE instead?

Cheers,
Andre

> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
>  vfio/pci.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/vfio/pci.c b/vfio/pci.c
> index a6d0408..7e258a4 100644
> --- a/vfio/pci.c
> +++ b/vfio/pci.c
> @@ -1,3 +1,5 @@
> +#include "linux/sizes.h"
> +
>  #include "kvm/irq.h"
>  #include "kvm/kvm.h"
>  #include "kvm/kvm-cpu.h"
> @@ -929,7 +931,7 @@ static int vfio_pci_create_msix_table(struct kvm
> *kvm, struct vfio_device *vdev) if (!info.size)
>  		return -EINVAL;
>  
> -	map_size = ALIGN(info.size, PAGE_SIZE);
> +	map_size = ALIGN(info.size, SZ_64K);
>  	table->guest_phys_addr = pci_get_mmio_block(map_size);
>  	if (!table->guest_phys_addr) {
>  		pr_err("cannot allocate MMIO space");
> @@ -960,7 +962,7 @@ static int vfio_pci_create_msix_table(struct kvm
> *kvm, struct vfio_device *vdev) if (!info.size)
>  			return -EINVAL;
>  
> -		map_size = ALIGN(info.size, PAGE_SIZE);
> +		map_size = ALIGN(info.size, SZ_64K);
>  		pba->guest_phys_addr = pci_get_mmio_block(map_size);
>  		if (!pba->guest_phys_addr) {
>  			pr_err("cannot allocate MMIO space");
Alexandru Elisei Oct. 11, 2021, 2:57 p.m. UTC | #2
Hi Andre,

On Wed, Oct 06, 2021 at 04:11:42PM +0100, Andre Przywara wrote:
> On Mon, 13 Sep 2021 16:44:13 +0100
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> Hi,
> 
> > When allocating MMIO space for the MSI-X table, kvmtool rounds the
> > allocation to the host's page size to make it as easy as possible for the
> > guest to map the table to a page, if it wants to (and doesn't do BAR
> > reassignment, like the x86 architecture for example). However, the host's
> > page size can differ from the guest's, for example, if the host is compiled
> > with 4k pages and the guest is using 64k pages.
> > 
> > To make sure the allocation is always aligned to a guest's page size, round
> > it up to the maximum page size, which is 64k. Do the same for the pending
> > bit array if it lives in its own BAR.
> 
> The idea of that looks alright on the first glance, but isn't needed for
> x86, right? So should this be using an arch-specific MAX_PAGE_SIZE instead?

By "not needed for x86", do you mean that 4k is enough and 64k is not needed? Or
that doing any kind of alignment is unnecessary? If it's the latter, Linux on
x86 relies on the BARs being programmed by the BIOS with valid addresses, so I
would rather err on the side of caution and align the BARs to at least 4k.

If it's the former, doing the alignment is not costing kvmtool anything, as the
addreses are just numbers in the PCI memory region, which is not allocated as
userspace memory by kvmtool, and it's about 1GiB, so I don't think 64 - 4 = 60k
will make much of a difference. With 100 devices, each with the MSIX table and
PBA in different BARs, that amounts to 60k * 2 * 100 ~= 12MiB. Out of 1GiB.

But I do agree that using something like MAX_PAGE_SIZE would be the correct way
to do it. I'll have a look at the page sizes for mips and powerpc and create the
define.

Thanks,
Alex

> 
> Cheers,
> Andre
> 
> > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > ---
> >  vfio/pci.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/vfio/pci.c b/vfio/pci.c
> > index a6d0408..7e258a4 100644
> > --- a/vfio/pci.c
> > +++ b/vfio/pci.c
> > @@ -1,3 +1,5 @@
> > +#include "linux/sizes.h"
> > +
> >  #include "kvm/irq.h"
> >  #include "kvm/kvm.h"
> >  #include "kvm/kvm-cpu.h"
> > @@ -929,7 +931,7 @@ static int vfio_pci_create_msix_table(struct kvm
> > *kvm, struct vfio_device *vdev) if (!info.size)
> >  		return -EINVAL;
> >  
> > -	map_size = ALIGN(info.size, PAGE_SIZE);
> > +	map_size = ALIGN(info.size, SZ_64K);
> >  	table->guest_phys_addr = pci_get_mmio_block(map_size);
> >  	if (!table->guest_phys_addr) {
> >  		pr_err("cannot allocate MMIO space");
> > @@ -960,7 +962,7 @@ static int vfio_pci_create_msix_table(struct kvm
> > *kvm, struct vfio_device *vdev) if (!info.size)
> >  			return -EINVAL;
> >  
> > -		map_size = ALIGN(info.size, PAGE_SIZE);
> > +		map_size = ALIGN(info.size, SZ_64K);
> >  		pba->guest_phys_addr = pci_get_mmio_block(map_size);
> >  		if (!pba->guest_phys_addr) {
> >  			pr_err("cannot allocate MMIO space");
>
diff mbox series

Patch

diff --git a/vfio/pci.c b/vfio/pci.c
index a6d0408..7e258a4 100644
--- a/vfio/pci.c
+++ b/vfio/pci.c
@@ -1,3 +1,5 @@ 
+#include "linux/sizes.h"
+
 #include "kvm/irq.h"
 #include "kvm/kvm.h"
 #include "kvm/kvm-cpu.h"
@@ -929,7 +931,7 @@  static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev)
 	if (!info.size)
 		return -EINVAL;
 
-	map_size = ALIGN(info.size, PAGE_SIZE);
+	map_size = ALIGN(info.size, SZ_64K);
 	table->guest_phys_addr = pci_get_mmio_block(map_size);
 	if (!table->guest_phys_addr) {
 		pr_err("cannot allocate MMIO space");
@@ -960,7 +962,7 @@  static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev)
 		if (!info.size)
 			return -EINVAL;
 
-		map_size = ALIGN(info.size, PAGE_SIZE);
+		map_size = ALIGN(info.size, SZ_64K);
 		pba->guest_phys_addr = pci_get_mmio_block(map_size);
 		if (!pba->guest_phys_addr) {
 			pr_err("cannot allocate MMIO space");