diff mbox

[PATCHv8,0/5] Driver for new "VMD" device

Message ID 20160222221024.GA20879@localhost (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Bjorn Helgaas Feb. 22, 2016, 10:10 p.m. UTC
On Wed, Jan 20, 2016 at 04:01:11PM -0600, Bjorn Helgaas wrote:
> On Sat, Jan 16, 2016 at 02:19:38PM -0800, Veal, Bryan E. wrote:
> > On Fri, Jan 15, 2016 at 03:49:02PM -0600, Bjorn Helgaas wrote:
> > > Even though you found this issue before posting the RFC code, I assume
> > > the issue is still relevant in the current code, and you still want to
> > > clear IORESOURCE_MEM_64, right?
> > 
> > Yes.
> > 
> > > This is where I get confused.  IORESOURCE_MEM_64 *should* mean "the
> > > hardware register associated with this resource can accommodate a
> > > 64-bit value."  If we're using IORESOURCE_MEM_64 to decide whether to
> > > use a prefetchable vs. a non-prefetchable window, that sounds broken.
> > > 
> > > Can you point me to the relevant code, and maybe give an example?  I'm
> > > pretty sure the code doesn't completely match the spec, and maybe this
> > > is a case where we have to set the flags non-intuitively to get the
> > > desired result.
> > > 
> > > > Below the port, the "prefetchable" propoerty
> > > > *is* restrictive: the addresses can't be used for non-prefetchable BARs.
> > > > 
> > > > Thus, in the specific case where a 64-bit non-prefetchable VMD bar happens
> > > > to contain a 32-bit address, removing the IORESOURCE_MEM_64 flag allows
> > > > the address resource to be used for *any* non-prefetchable BARs (32-bit or
> > > > 64-bit) downstream.  
> > > 
> > > If I understand correctly, these VMD BARs (VMD_MEMBAR1 and
> > > VMD_MEMBAR2) effectively become the host bridge windows available for
> > > devices below the VMD.
> > > 
> > > I infer that if the VMD host bridge window is non-prefetchable and has
> > > IORESOURCE_MEM_64 set, we won't put a 32-bit non-prefetchable BAR in
> > > that window.  That sounds like a bug, but let me be the first to admit
> > > that I don't understand our PCI resource allocation code.
> > 
> > I don't think anything is broken. You are correct that the MEMBARs are
> > used as a host bridge window. The reason to clear the flag is a side
> > effect of that.
> > 
> > For BARs, the flags describe capabilities. For resources, they are
> > interpreted as restrictions.
> > 
> > If VMD has a 32-bit resource in a 64-bit non-prefetchable BAR, without
> > clearing the flag, it yields a host bridge resource, and thus root bus
> > resource, with IORESOURCE_MEM_64 set.
> > 
> > Downstream of VMD, the root port's 32-bit non-prefetchable base/limit
> > registers can't handle the 64-bit resource, but the 64-bit prefetchable
> > window can, so that's where it ends up. (See pci_bus_alloc_resource().)
> 
> OK, I think I finally found the critical comment, which is in
> __pci_assign_resource():
> 
>   Even if a 64-bit prefetchable bridge window is below 4GB, we can't
>   put a 32-bit prefetchable resource in it because pbus_size_mem()
>   assumes a 64-bit window will contain no 32-bit resources.  If we
>   assign things differently than they were sized, not everything will
>   fit.
> 
> There's no reason we can't put a Root Port's 32-bit non-prefetchable
> window inside a 64-bit VMD window that happens to be below 4GB,
> *except* for the fact that pbus_size_mem() assumes we won't do that.
> 
> The VMD code needs a reference to that comment.
> 
> I guess you're relying on BIOS to assign your non-prefetchable VMD BAR
> below 4GB even though it's a 64-bit BAR?  If Linux assigned that BAR,
> e.g., after a hot-add of a VMD, we might put it above 4GB, and then
> Root Ports downstream from the VMD would not be able to use any
> non-prefetchable space.

I see another VMD patch on the list, but I'm still waiting for
resolution to this comment and question.  For the first one, about
clearing IORESOURCE_MEM_64, I have in mind something like the
following patch.

I'm not sure how to deal with the question of a hot-added VMD.  Maybe
all we can do now is add a comment to the effect that we assume BIOS
has assigned the non-prefetchable BAR below 4GB, and if Linux assigns
that BAR for hot-added VMDs, that assumption will likely break.

Bjorn


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Keith Busch Feb. 23, 2016, 6:24 p.m. UTC | #1
On Mon, Feb 22, 2016 at 04:10:24PM -0600, Bjorn Helgaas wrote:
> I'm not sure how to deal with the question of a hot-added VMD.  Maybe
> all we can do now is add a comment to the effect that we assume BIOS
> has assigned the non-prefetchable BAR below 4GB, and if Linux assigns
> that BAR for hot-added VMDs, that assumption will likely break.

Yes, we can assume BIOS always assigns. There are other BIOS dependencies
in order for the host to see the h/w as a VMD endpoint.

> diff --git a/arch/x86/pci/vmd.c b/arch/x86/pci/vmd.c
> index d57e480..7554722 100644
> --- a/arch/x86/pci/vmd.c
> +++ b/arch/x86/pci/vmd.c
> @@ -532,6 +532,16 @@ static int vmd_enable_domain(struct vmd_dev *vmd)
>  		.flags = IORESOURCE_BUS | IORESOURCE_PCI_FIXED,
>  	};
>  
> +	/*
> +	 * If the window is below 4GB, clear IORESOURCE_MEM_64 so we can
> +	 * put 32-bit resources in the window.
> +	 *
> +	 * There's no hardware reason why a 64-bit window *couldn't*
> +	 * contain a 32-bit resource, but pbus_size_mem() computes the
> +	 * bridge window size assuming a 64-bit window will contain no
> +	 * 32-bit resources.  __pci_assign_resource() enforces that
> +	 * artificial restriction to make sure everything will fit.
> +	 */

This sounds good to me. Thanks!

>  	res = &vmd->dev->resource[VMD_MEMBAR1];
>  	upper_bits = upper_32_bits(res->end);
>  	flags = res->flags & ~IORESOURCE_SIZEALIGN;
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Feb. 25, 2016, 2:42 p.m. UTC | #2
On Tue, Feb 23, 2016 at 06:24:00PM +0000, Keith Busch wrote:
> On Mon, Feb 22, 2016 at 04:10:24PM -0600, Bjorn Helgaas wrote:
> > I'm not sure how to deal with the question of a hot-added VMD.  Maybe
> > all we can do now is add a comment to the effect that we assume BIOS
> > has assigned the non-prefetchable BAR below 4GB, and if Linux assigns
> > that BAR for hot-added VMDs, that assumption will likely break.
> 
> Yes, we can assume BIOS always assigns. There are other BIOS dependencies
> in order for the host to see the h/w as a VMD endpoint.
> 
> > diff --git a/arch/x86/pci/vmd.c b/arch/x86/pci/vmd.c
> > index d57e480..7554722 100644
> > --- a/arch/x86/pci/vmd.c
> > +++ b/arch/x86/pci/vmd.c
> > @@ -532,6 +532,16 @@ static int vmd_enable_domain(struct vmd_dev *vmd)
> >  		.flags = IORESOURCE_BUS | IORESOURCE_PCI_FIXED,
> >  	};
> >  
> > +	/*
> > +	 * If the window is below 4GB, clear IORESOURCE_MEM_64 so we can
> > +	 * put 32-bit resources in the window.
> > +	 *
> > +	 * There's no hardware reason why a 64-bit window *couldn't*
> > +	 * contain a 32-bit resource, but pbus_size_mem() computes the
> > +	 * bridge window size assuming a 64-bit window will contain no
> > +	 * 32-bit resources.  __pci_assign_resource() enforces that
> > +	 * artificial restriction to make sure everything will fit.
> > +	 */
> 
> This sounds good to me. Thanks!
> 
> >  	res = &vmd->dev->resource[VMD_MEMBAR1];
> >  	upper_bits = upper_32_bits(res->end);
> >  	flags = res->flags & ~IORESOURCE_SIZEALIGN;

Can you prepare a patch, that adds both comments, please?  (The one
about how we assume BIOS assigns the BAR below 4GB, and the one I
drafted above.)

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Keith Busch Feb. 25, 2016, 2:50 p.m. UTC | #3
On Thu, Feb 25, 2016 at 08:42:19AM -0600, Bjorn Helgaas wrote:
> On Tue, Feb 23, 2016 at 06:24:00PM +0000, Keith Busch wrote:
> > On Mon, Feb 22, 2016 at 04:10:24PM -0600, Bjorn Helgaas wrote:
> > > +	/*
> > > +	 * If the window is below 4GB, clear IORESOURCE_MEM_64 so we can
> > > +	 * put 32-bit resources in the window.
> > > +	 *
> > > +	 * There's no hardware reason why a 64-bit window *couldn't*
> > > +	 * contain a 32-bit resource, but pbus_size_mem() computes the
> > > +	 * bridge window size assuming a 64-bit window will contain no
> > > +	 * 32-bit resources.  __pci_assign_resource() enforces that
> > > +	 * artificial restriction to make sure everything will fit.
> > > +	 */
> > 
> > This sounds good to me. Thanks!
> > 
> > >  	res = &vmd->dev->resource[VMD_MEMBAR1];
> > >  	upper_bits = upper_32_bits(res->end);
> > >  	flags = res->flags & ~IORESOURCE_SIZEALIGN;
> 
> Can you prepare a patch, that adds both comments, please?  (The one
> about how we assume BIOS assigns the BAR below 4GB, and the one I
> drafted above.)

Yep, will send it out be end of today.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Keith Busch Feb. 26, 2016, 3:29 p.m. UTC | #4
Hi Bryan

I want to make sure I get the right message for Bjorn, so held off on
sending anything before confirming.

As far as clearing the 64-bit resource flag when it's a 32-bit address,
that seems pretty straight forward as captured in the comment.

But do we actually rely on a memory window below 4GB? If the memory
window is above 4GB, we just won't be able to support devices with 32-bit
BARs there, but is that a concern? Should the comment just say that if
the window is assigned >4GB, we simply won't be able to support 32-bit
BAR devices on that domain, and that's 'ok'?

Thanks,
Keith

On Thu, Feb 25, 2016 at 08:42:19AM -0600, Bjorn Helgaas wrote:
> On Tue, Feb 23, 2016 at 06:24:00PM +0000, Keith Busch wrote:
> > On Mon, Feb 22, 2016 at 04:10:24PM -0600, Bjorn Helgaas wrote:
> > > I'm not sure how to deal with the question of a hot-added VMD.  Maybe
> > > all we can do now is add a comment to the effect that we assume BIOS
> > > has assigned the non-prefetchable BAR below 4GB, and if Linux assigns
> > > that BAR for hot-added VMDs, that assumption will likely break.
> > 
> > Yes, we can assume BIOS always assigns. There are other BIOS dependencies
> > in order for the host to see the h/w as a VMD endpoint.
> > 
> > > diff --git a/arch/x86/pci/vmd.c b/arch/x86/pci/vmd.c
> > > index d57e480..7554722 100644
> > > --- a/arch/x86/pci/vmd.c
> > > +++ b/arch/x86/pci/vmd.c
> > > @@ -532,6 +532,16 @@ static int vmd_enable_domain(struct vmd_dev *vmd)
> > >  		.flags = IORESOURCE_BUS | IORESOURCE_PCI_FIXED,
> > >  	};
> > >  
> > > +	/*
> > > +	 * If the window is below 4GB, clear IORESOURCE_MEM_64 so we can
> > > +	 * put 32-bit resources in the window.
> > > +	 *
> > > +	 * There's no hardware reason why a 64-bit window *couldn't*
> > > +	 * contain a 32-bit resource, but pbus_size_mem() computes the
> > > +	 * bridge window size assuming a 64-bit window will contain no
> > > +	 * 32-bit resources.  __pci_assign_resource() enforces that
> > > +	 * artificial restriction to make sure everything will fit.
> > > +	 */
> > 
> > This sounds good to me. Thanks!
> > 
> > >  	res = &vmd->dev->resource[VMD_MEMBAR1];
> > >  	upper_bits = upper_32_bits(res->end);
> > >  	flags = res->flags & ~IORESOURCE_SIZEALIGN;
> 
> Can you prepare a patch, that adds both comments, please?  (The one
> about how we assume BIOS assigns the BAR below 4GB, and the one I
> drafted above.)
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/x86/pci/vmd.c b/arch/x86/pci/vmd.c
index d57e480..7554722 100644
--- a/arch/x86/pci/vmd.c
+++ b/arch/x86/pci/vmd.c
@@ -532,6 +532,16 @@  static int vmd_enable_domain(struct vmd_dev *vmd)
 		.flags = IORESOURCE_BUS | IORESOURCE_PCI_FIXED,
 	};
 
+	/*
+	 * If the window is below 4GB, clear IORESOURCE_MEM_64 so we can
+	 * put 32-bit resources in the window.
+	 *
+	 * There's no hardware reason why a 64-bit window *couldn't*
+	 * contain a 32-bit resource, but pbus_size_mem() computes the
+	 * bridge window size assuming a 64-bit window will contain no
+	 * 32-bit resources.  __pci_assign_resource() enforces that
+	 * artificial restriction to make sure everything will fit.
+	 */
 	res = &vmd->dev->resource[VMD_MEMBAR1];
 	upper_bits = upper_32_bits(res->end);
 	flags = res->flags & ~IORESOURCE_SIZEALIGN;