diff mbox series

[RFC,4/6] x86: PCI: preserve IORESOURCE_STARTALIGN alignment

Message ID 20240709133610.1089420-5-stewart.hildebrand@amd.com (mailing list archive)
State New
Delegated to: Bjorn Helgaas
Headers show
Series PCI: align small (<4k) BARs | expand

Commit Message

Stewart Hildebrand July 9, 2024, 1:36 p.m. UTC
Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
x86 due to the alignment being overwritten in
pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
make it work on x86.

Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
---
RFC: We don't have enough info in this function to re-calculate the
     alignment value in case of IORESOURCE_STARTALIGN. Luckily our
     alignment value seems to be intact, so just don't touch it...
     Alternatively, we could call pci_reassigndev_resource_alignment()
     after the loop. Would that be preferable?
---
 arch/x86/pci/i386.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

Bjorn Helgaas July 9, 2024, 4:19 p.m. UTC | #1
On Tue, Jul 09, 2024 at 09:36:01AM -0400, Stewart Hildebrand wrote:
> Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
> x86 due to the alignment being overwritten in
> pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
> make it work on x86.

Is this a regression?  I didn't look up when IORESOURCE_STARTALIGN was
added, but likely it was for some situation on x86, so presumably it
worked at one time.  If something broke it in the meantime, it would
be nice to identify the commit that broke it.

Nit: follow the subject line conventions for this and the other
patches.  Learn them with "git log --oneline".  For this patch,
"x86/PCI: <Capitalized text>" is appropriate.

> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> ---
> RFC: We don't have enough info in this function to re-calculate the
>      alignment value in case of IORESOURCE_STARTALIGN. Luckily our
>      alignment value seems to be intact, so just don't touch it...
>      Alternatively, we could call pci_reassigndev_resource_alignment()
>      after the loop. Would that be preferable?
> ---
>  arch/x86/pci/i386.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
> index f2f4a5d50b27..ff6e61389ec7 100644
> --- a/arch/x86/pci/i386.c
> +++ b/arch/x86/pci/i386.c
> @@ -283,8 +283,11 @@ static void pcibios_allocate_dev_resources(struct pci_dev *dev, int pass)
>  						/* We'll assign a new address later */
>  						pcibios_save_fw_addr(dev,
>  								idx, r->start);
> -						r->end -= r->start;
> -						r->start = 0;
> +						if (!(r->flags &
> +						      IORESOURCE_STARTALIGN)) {
> +							r->end -= r->start;
> +							r->start = 0;
> +						}
>  					}
>  				}
>  			}
> -- 
> 2.45.2
>
Ilpo Järvinen July 10, 2024, 2:05 p.m. UTC | #2
On Tue, 9 Jul 2024, Stewart Hildebrand wrote:

> Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
> x86 due to the alignment being overwritten in
> pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
> make it work on x86.
> 
> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> ---
> RFC: We don't have enough info in this function to re-calculate the
>      alignment value in case of IORESOURCE_STARTALIGN. Luckily our
>      alignment value seems to be intact, so just don't touch it...
>      Alternatively, we could call pci_reassigndev_resource_alignment()
>      after the loop. Would that be preferable?
> ---
>  arch/x86/pci/i386.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
> index f2f4a5d50b27..ff6e61389ec7 100644
> --- a/arch/x86/pci/i386.c
> +++ b/arch/x86/pci/i386.c
> @@ -283,8 +283,11 @@ static void pcibios_allocate_dev_resources(struct pci_dev *dev, int pass)
>  						/* We'll assign a new address later */
>  						pcibios_save_fw_addr(dev,
>  								idx, r->start);
> -						r->end -= r->start;
> -						r->start = 0;
> +						if (!(r->flags &
> +						      IORESOURCE_STARTALIGN)) {
> +							r->end -= r->start;
> +							r->start = 0;
> +						}
>  					}
>  				}
>  			}
> 

As a general comment to that loop in pcibios_allocate_dev_resources() 
function, it would be nice to reverse some of the logic in the if 
conditions and use continue to limit the runaway indentation level.
Stewart Hildebrand July 10, 2024, 4:16 p.m. UTC | #3
On 7/9/24 12:19, Bjorn Helgaas wrote:
> On Tue, Jul 09, 2024 at 09:36:01AM -0400, Stewart Hildebrand wrote:
>> Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
>> x86 due to the alignment being overwritten in
>> pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
>> make it work on x86.
> 
> Is this a regression?  I didn't look up when IORESOURCE_STARTALIGN was
> added, but likely it was for some situation on x86, so presumably it
> worked at one time.  If something broke it in the meantime, it would
> be nice to identify the commit that broke it.

No, I don't have reason to believe it's a regression.

IORESOURCE_STARTALIGN was introduced in commit 884525655d07 ("PCI: clean
up resource alignment management").

There are some interesting commits mentioning 884525655d07:
5f17cfce5776 ("PCI: fix pbus_size_mem() resource alignment for CardBus
              controllers")
934b7024f0ed ("Fix cardbus resource allocation")

It would seem that 884525655d07 missed updating the bits in
arch/x86/pci/i386.c. I don't think a Fixes: tag is strictly necessary
because I think the issue would only start to trigger once the default
alignment is updated in the last patch of this series.

As far as I can tell, it only breaks in a certain corner case that's not
really possible to trigger yet. The corner case seems to be the
following:
* Only on x86
* The BAR has been set in firmware
* Alignment has been requested via IORESOURCE_STARTALIGN
* The IORESOURCE_UNSET flag is set
* Only PCI_STD_RESOURCES and PCI_IOV_RESOURCES resources (not bridge
  windows)

I think the reason this hasn't been seen until now is that it's not
possible to request IORESOURCE_STARTALIGN alignment via the
pci=resource_alignment= option. That option instead sets
IORESOURCE_SIZEALIGN, and in that case it works fine.

After the last patch in this series that changes the default alignment,
we will be starting to use IORESOURCE_STARTALIGN on all
not-sufficiently-aligned resources, and the corner case would be more
likely (rather, possible at all) to trigger.

My commit message is overstating the severity of the issue. So, how
about I make the commit message less scary:

There is a corner case in pcibios_allocate_dev_resources() that doesn't
account for IORESOURCE_STARTALIGN, in which case the alignment would be
overwritten. As far as I can tell, the corner case is not yet possible
to trigger, but it will be possible once the resource alignment changes.
Account for IORESOURCE_STARTALIGN in preparation for changing the
default resource alignment.

> Nit: follow the subject line conventions for this and the other
> patches.  Learn them with "git log --oneline".  For this patch,
> "x86/PCI: <Capitalized text>" is appropriate.

Thanks for pointing this out, I'll fix

>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>> ---
>> RFC: We don't have enough info in this function to re-calculate the
>>      alignment value in case of IORESOURCE_STARTALIGN. Luckily our
>>      alignment value seems to be intact, so just don't touch it...
>>      Alternatively, we could call pci_reassigndev_resource_alignment()
>>      after the loop. Would that be preferable?

Any comments on this? After some more thought, I think calling
pci_reassigndev_resource_alignment() after the loop is probably more
correct, so if there aren't any comments, I'll plan on changing it.

>> ---
>>  arch/x86/pci/i386.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
>> index f2f4a5d50b27..ff6e61389ec7 100644
>> --- a/arch/x86/pci/i386.c
>> +++ b/arch/x86/pci/i386.c
>> @@ -283,8 +283,11 @@ static void pcibios_allocate_dev_resources(struct pci_dev *dev, int pass)
>>  						/* We'll assign a new address later */
>>  						pcibios_save_fw_addr(dev,
>>  								idx, r->start);
>> -						r->end -= r->start;
>> -						r->start = 0;
>> +						if (!(r->flags &
>> +						      IORESOURCE_STARTALIGN)) {
>> +							r->end -= r->start;
>> +							r->start = 0;
>> +						}
>>  					}
>>  				}
>>  			}
>> -- 
>> 2.45.2
>>
Bjorn Helgaas July 10, 2024, 9:24 p.m. UTC | #4
On Wed, Jul 10, 2024 at 12:16:24PM -0400, Stewart Hildebrand wrote:
> On 7/9/24 12:19, Bjorn Helgaas wrote:
> > On Tue, Jul 09, 2024 at 09:36:01AM -0400, Stewart Hildebrand wrote:
> >> Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
> >> x86 due to the alignment being overwritten in
> >> pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
> >> make it work on x86.
> > 
> > Is this a regression?  I didn't look up when IORESOURCE_STARTALIGN was
> > added, but likely it was for some situation on x86, so presumably it
> > worked at one time.  If something broke it in the meantime, it would
> > be nice to identify the commit that broke it.
> 
> No, I don't have reason to believe it's a regression.
> 
> IORESOURCE_STARTALIGN was introduced in commit 884525655d07 ("PCI: clean
> up resource alignment management").

Ah, OK.  IORESOURCE_STARTALIGN is used for bridge windows, which don't
need to be aligned on their size as BARs do.  It would be terrible if
that usage was broken, which is why I was alarmed by the idea of it
not working on x86.

But this patch is only relevant for BARs.  I was a little confused
about IORESOURCE_STARTALIGN for a BAR, but I guess the point is to
force alignment on *more* than the BAR's size, e.g., to prevent
multiple BARs from being put in the same page.

Bottom line, this would need to be a little more specific so it
doesn't suggest that IORESOURCE_STARTALIGN for windows is broken.

IIUC, the main purpose of the series is to align all BARs to at least
4K.  I don't think the series relies on IORESOURCE_STARTALIGN to do
that.  But there's an issue with "pci=resource_alignment=..." that you
noticed sort of incidentally, and this patch fixes that?  If so, it's
important to mention that parameter.

> >> RFC: We don't have enough info in this function to re-calculate the
> >>      alignment value in case of IORESOURCE_STARTALIGN. Luckily our
> >>      alignment value seems to be intact, so just don't touch it...
> >>      Alternatively, we could call pci_reassigndev_resource_alignment()
> >>      after the loop. Would that be preferable?
> 
> Any comments on this? After some more thought, I think calling
> pci_reassigndev_resource_alignment() after the loop is probably more
> correct, so if there aren't any comments, I'll plan on changing it.

Sounds like this might be a separate patch unless it logically has to
be part of this one to avoid an issue.

> >> ---
> >>  arch/x86/pci/i386.c | 7 +++++--
> >>  1 file changed, 5 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
> >> index f2f4a5d50b27..ff6e61389ec7 100644
> >> --- a/arch/x86/pci/i386.c
> >> +++ b/arch/x86/pci/i386.c
> >> @@ -283,8 +283,11 @@ static void pcibios_allocate_dev_resources(struct pci_dev *dev, int pass)
> >>  						/* We'll assign a new address later */
> >>  						pcibios_save_fw_addr(dev,
> >>  								idx, r->start);
> >> -						r->end -= r->start;
> >> -						r->start = 0;
> >> +						if (!(r->flags &
> >> +						      IORESOURCE_STARTALIGN)) {
> >> +							r->end -= r->start;
> >> +							r->start = 0;
> >> +						}

I wondered why this only touched x86 and whether other arches need a
similar change.  This is used in two paths:

  1) pcibios_resource_survey_bus(), which is only implemented by x86

  2) pcibios_resource_survey(), which is implemented by x86 and
  powerpc.  The powerpc pcibios_allocate_resources() is similar to the
  x86 pcibios_allocate_dev_resources(), but powerpc doesn't have the
  r->end and r->start updates you're making conditional.

So it looks like x86 is indeed the only place that needs this change.
None of this stuff is arch-specific, so it's a shame that we don't
have generic code for it all.  Sigh, oh well.

> >>  					}
> >>  				}
> >>  			}
> >> -- 
> >> 2.45.2
> >>
>
Stewart Hildebrand July 10, 2024, 10:49 p.m. UTC | #5
On 7/10/24 17:24, Bjorn Helgaas wrote:
> On Wed, Jul 10, 2024 at 12:16:24PM -0400, Stewart Hildebrand wrote:
>> On 7/9/24 12:19, Bjorn Helgaas wrote:
>>> On Tue, Jul 09, 2024 at 09:36:01AM -0400, Stewart Hildebrand wrote:
>>>> Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
>>>> x86 due to the alignment being overwritten in
>>>> pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
>>>> make it work on x86.
>>>
>>> Is this a regression?  I didn't look up when IORESOURCE_STARTALIGN was
>>> added, but likely it was for some situation on x86, so presumably it
>>> worked at one time.  If something broke it in the meantime, it would
>>> be nice to identify the commit that broke it.
>>
>> No, I don't have reason to believe it's a regression.
>>
>> IORESOURCE_STARTALIGN was introduced in commit 884525655d07 ("PCI: clean
>> up resource alignment management").
> 
> Ah, OK.  IORESOURCE_STARTALIGN is used for bridge windows, which don't
> need to be aligned on their size as BARs do.  It would be terrible if
> that usage was broken, which is why I was alarmed by the idea of it
> not working on x86> 
> But this patch is only relevant for BARs.  I was a little confused
> about IORESOURCE_STARTALIGN for a BAR, but I guess the point is to
> force alignment on *more* than the BAR's size, e.g., to prevent
> multiple BARs from being put in the same page.
> 
> Bottom line, this would need to be a little more specific so it
> doesn't suggest that IORESOURCE_STARTALIGN for windows is broken.

I'll make the commit message clearer.

> IIUC, the main purpose of the series is to align all BARs to at least
> 4K.  I don't think the series relies on IORESOURCE_STARTALIGN to do
> that.

Yes, it does rely on IORESOURCE_STARTALIGN for BARs.

> But there's an issue with "pci=resource_alignment=..." that you
> noticed sort of incidentally, and this patch fixes that?

No, pci=resource_alignment= results in IORESOURCE_SIZEALIGN, which
breaks pcitest. And we'd like pcitest to work properly for PCI
passthrough validation with Xen, hence the need for
IORESOURCE_STARTALIGN.

> If so, it's
> important to mention that parameter.
> 
>>>> RFC: We don't have enough info in this function to re-calculate the
>>>>      alignment value in case of IORESOURCE_STARTALIGN. Luckily our
>>>>      alignment value seems to be intact, so just don't touch it...
>>>>      Alternatively, we could call pci_reassigndev_resource_alignment()
>>>>      after the loop. Would that be preferable?
>>
>> Any comments on this? After some more thought, I think calling
>> pci_reassigndev_resource_alignment() after the loop is probably more
>> correct, so if there aren't any comments, I'll plan on changing it.
> 
> Sounds like this might be a separate patch unless it logically has to
> be part of this one to avoid an issue
> 
>>>> ---
>>>>  arch/x86/pci/i386.c | 7 +++++--
>>>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
>>>> index f2f4a5d50b27..ff6e61389ec7 100644
>>>> --- a/arch/x86/pci/i386.c
>>>> +++ b/arch/x86/pci/i386.c
>>>> @@ -283,8 +283,11 @@ static void pcibios_allocate_dev_resources(struct pci_dev *dev, int pass)
>>>>  						/* We'll assign a new address later */
>>>>  						pcibios_save_fw_addr(dev,
>>>>  								idx, r->start);
>>>> -						r->end -= r->start;
>>>> -						r->start = 0;
>>>> +						if (!(r->flags &
>>>> +						      IORESOURCE_STARTALIGN)) {
>>>> +							r->end -= r->start;
>>>> +							r->start = 0;
>>>> +						}
> 
> I wondered why this only touched x86 and whether other arches need a
> similar change.  This is used in two paths:
> 
>   1) pcibios_resource_survey_bus(), which is only implemented by x86
> 
>   2) pcibios_resource_survey(), which is implemented by x86 and
>   powerpc.  The powerpc pcibios_allocate_resources() is similar to the
>   x86 pcibios_allocate_dev_resources(), but powerpc doesn't have the
>   r->end and r->start updates you're making conditional.
> 
> So it looks like x86 is indeed the only place that needs this change.
> None of this stuff is arch-specific, so it's a shame that we don't
> have generic code for it all.  Sigh, oh well.
> 
>>>>  					}
>>>>  				}
>>>>  			}
>>>> -- 
>>>> 2.45.2
>>>>
>>
Bjorn Helgaas July 11, 2024, 6:40 p.m. UTC | #6
On Wed, Jul 10, 2024 at 06:49:42PM -0400, Stewart Hildebrand wrote:
> On 7/10/24 17:24, Bjorn Helgaas wrote:
> > On Wed, Jul 10, 2024 at 12:16:24PM -0400, Stewart Hildebrand wrote:
> >> On 7/9/24 12:19, Bjorn Helgaas wrote:
> >>> On Tue, Jul 09, 2024 at 09:36:01AM -0400, Stewart Hildebrand wrote:
> >>>> Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
> >>>> x86 due to the alignment being overwritten in
> >>>> pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
> >>>> make it work on x86.
> >>>
> >>> Is this a regression?  I didn't look up when IORESOURCE_STARTALIGN was
> >>> added, but likely it was for some situation on x86, so presumably it
> >>> worked at one time.  If something broke it in the meantime, it would
> >>> be nice to identify the commit that broke it.
> >>
> >> No, I don't have reason to believe it's a regression.
> >>
> >> IORESOURCE_STARTALIGN was introduced in commit 884525655d07 ("PCI: clean
> >> up resource alignment management").
> > 
> > Ah, OK.  IORESOURCE_STARTALIGN is used for bridge windows, which don't
> > need to be aligned on their size as BARs do.  It would be terrible if
> > that usage was broken, which is why I was alarmed by the idea of it
> > not working on x86> 
> > But this patch is only relevant for BARs.  I was a little confused
> > about IORESOURCE_STARTALIGN for a BAR, but I guess the point is to
> > force alignment on *more* than the BAR's size, e.g., to prevent
> > multiple BARs from being put in the same page.
> > 
> > Bottom line, this would need to be a little more specific so it
> > doesn't suggest that IORESOURCE_STARTALIGN for windows is broken.
> 
> I'll make the commit message clearer.
> 
> > IIUC, the main purpose of the series is to align all BARs to at least
> > 4K.  I don't think the series relies on IORESOURCE_STARTALIGN to do
> > that.
> 
> Yes, it does rely on IORESOURCE_STARTALIGN for BARs.

Oh, I missed that, sorry.  The only places I see that set
IORESOURCE_STARTALIGN are pci_request_resource_alignment(), which is
where I got the "pci=resource_alignment=..." connection, and
pbus_size_io(), pbus_size_mem(), and pci_bus_size_cardbus(), which are
for bridge windows, AFAICS.

Doesn't the >= 4K alignment in this series hinge on the
pcibios_default_alignment() change?  It looks like that would force at
least 4K alignment independent of IORESOURCE_STARTALIGN.

> > But there's an issue with "pci=resource_alignment=..." that you
> > noticed sort of incidentally, and this patch fixes that?
> 
> No, pci=resource_alignment= results in IORESOURCE_SIZEALIGN, which
> breaks pcitest. And we'd like pcitest to work properly for PCI
> passthrough validation with Xen, hence the need for
> IORESOURCE_STARTALIGN.
Stewart Hildebrand July 11, 2024, 6:58 p.m. UTC | #7
On 7/11/24 14:40, Bjorn Helgaas wrote:
> On Wed, Jul 10, 2024 at 06:49:42PM -0400, Stewart Hildebrand wrote:
>> On 7/10/24 17:24, Bjorn Helgaas wrote:
>>> On Wed, Jul 10, 2024 at 12:16:24PM -0400, Stewart Hildebrand wrote:
>>>> On 7/9/24 12:19, Bjorn Helgaas wrote:
>>>>> On Tue, Jul 09, 2024 at 09:36:01AM -0400, Stewart Hildebrand wrote:
>>>>>> Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
>>>>>> x86 due to the alignment being overwritten in
>>>>>> pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
>>>>>> make it work on x86.
>>>>>
>>>>> Is this a regression?  I didn't look up when IORESOURCE_STARTALIGN was
>>>>> added, but likely it was for some situation on x86, so presumably it
>>>>> worked at one time.  If something broke it in the meantime, it would
>>>>> be nice to identify the commit that broke it.
>>>>
>>>> No, I don't have reason to believe it's a regression.
>>>>
>>>> IORESOURCE_STARTALIGN was introduced in commit 884525655d07 ("PCI: clean
>>>> up resource alignment management").
>>>
>>> Ah, OK.  IORESOURCE_STARTALIGN is used for bridge windows, which don't
>>> need to be aligned on their size as BARs do.  It would be terrible if
>>> that usage was broken, which is why I was alarmed by the idea of it
>>> not working on x86> 
>>> But this patch is only relevant for BARs.  I was a little confused
>>> about IORESOURCE_STARTALIGN for a BAR, but I guess the point is to
>>> force alignment on *more* than the BAR's size, e.g., to prevent
>>> multiple BARs from being put in the same page.
>>>
>>> Bottom line, this would need to be a little more specific so it
>>> doesn't suggest that IORESOURCE_STARTALIGN for windows is broken.
>>
>> I'll make the commit message clearer.
>>
>>> IIUC, the main purpose of the series is to align all BARs to at least
>>> 4K.  I don't think the series relies on IORESOURCE_STARTALIGN to do
>>> that.
>>
>> Yes, it does rely on IORESOURCE_STARTALIGN for BARs.
> 
> Oh, I missed that, sorry.  The only places I see that set
> IORESOURCE_STARTALIGN are pci_request_resource_alignment(), which is
> where I got the "pci=resource_alignment=..." connection, and
> pbus_size_io(), pbus_size_mem(), and pci_bus_size_cardbus(), which are
> for bridge windows, AFAICS.
> 
> Doesn't the >= 4K alignment in this series hinge on the
> pcibios_default_alignment() change?

Yep

> It looks like that would force at
> least 4K alignment independent of IORESOURCE_STARTALIGN.

Changing pcibios_default_alignment() (without pci=resource_alignment=
specified) results in IORESOURCE_STARTALIGN.

>>> But there's an issue with "pci=resource_alignment=..." that you
>>> noticed sort of incidentally, and this patch fixes that?
>>
>> No, pci=resource_alignment= results in IORESOURCE_SIZEALIGN, which
>> breaks pcitest. And we'd like pcitest to work properly for PCI
>> passthrough validation with Xen, hence the need for
>> IORESOURCE_STARTALIGN.
Bjorn Helgaas July 11, 2024, 8:35 p.m. UTC | #8
[+cc Yongji Xie]

On Thu, Jul 11, 2024 at 02:58:24PM -0400, Stewart Hildebrand wrote:
> On 7/11/24 14:40, Bjorn Helgaas wrote:
> > On Wed, Jul 10, 2024 at 06:49:42PM -0400, Stewart Hildebrand wrote:
> >> On 7/10/24 17:24, Bjorn Helgaas wrote:
> >>> On Wed, Jul 10, 2024 at 12:16:24PM -0400, Stewart Hildebrand wrote:
> >>>> On 7/9/24 12:19, Bjorn Helgaas wrote:
> >>>>> On Tue, Jul 09, 2024 at 09:36:01AM -0400, Stewart Hildebrand wrote:
> >>>>>> Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
> >>>>>> x86 due to the alignment being overwritten in
> >>>>>> pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
> >>>>>> make it work on x86.
> ...

> >>> IIUC, the main purpose of the series is to align all BARs to at least
> >>> 4K.  I don't think the series relies on IORESOURCE_STARTALIGN to do
> >>> that.
> >>
> >> Yes, it does rely on IORESOURCE_STARTALIGN for BARs.
> > 
> > Oh, I missed that, sorry.  The only places I see that set
> > IORESOURCE_STARTALIGN are pci_request_resource_alignment(), which is
> > where I got the "pci=resource_alignment=..." connection, and
> > pbus_size_io(), pbus_size_mem(), and pci_bus_size_cardbus(), which are
> > for bridge windows, AFAICS.
> > 
> > Doesn't the >= 4K alignment in this series hinge on the
> > pcibios_default_alignment() change?
> 
> Yep
> 
> > It looks like that would force at
> > least 4K alignment independent of IORESOURCE_STARTALIGN.
> 
> Changing pcibios_default_alignment() (without pci=resource_alignment=
> specified) results in IORESOURCE_STARTALIGN.

Mmmm.  I guess it's this path:

  pci_device_add
    pci_reassigndev_resource_alignment
      align = pci_specified_resource_alignment(&resize)
	pcibios_default_alignment
      for (i = 0; i <= PCI_ROM_RESOURCE; i++)
	pci_request_resource_alignment(i, align, resize)
	  if (!resize)
	    r->flags |= IORESOURCE_STARTALIGN

where "resize" is false because the device wasn't mentioned in a
"pci=resource_alignment=..." parameter, so
pci_request_resource_alignment() sets IORESOURCE_STARTALIGN.

When 0a701aa63784 ("PCI: Add pcibios_default_alignment() for
arch-specific alignment control") added pcibios_default_alignment(),
we got a way to do arch-specific alignment, but if the alignment is
non-zero, the implementation *also* applies that alignment to ALL
devices in the system.

Prior to 0a701aa63784, I think pci_specified_resource_alignment()
only caused increased alignment for devices mentioned with a
"pci=resource_alignment=..." parameter.

I suppose the change to do it for all devices was intentional because
382746376993 ("powerpc/powernv: Override pcibios_default_alignment()
to force PCI devices to be page aligned") says it's for all PCI
devices on PowerNV.

Since 0a701aa63784 and 382746376993 were for VFIO, which is generic, I
kind of wish that we'd done it in a more generic way instead of making
a pcibios interface that is only implemented for PowerNV.

This series does make it generic by doing it in the weak
pcibios_default_alignment() that's used by default, so that's good.

It's ancient history now, but I'm also a little unsure about the way
pci_reassigndev_resource_alignment() is kind of tacked on at the end
in pci_device_add() and not integrated with the usual BAR sizing and
allocation machinery.

> >>> But there's an issue with "pci=resource_alignment=..." that you
> >>> noticed sort of incidentally, and this patch fixes that?
> >>
> >> No, pci=resource_alignment= results in IORESOURCE_SIZEALIGN, which
> >> breaks pcitest. And we'd like pcitest to work properly for PCI
> >> passthrough validation with Xen, hence the need for
> >> IORESOURCE_STARTALIGN.

Thanks for working on this.

Bjorn
Stewart Hildebrand July 15, 2024, 5:26 p.m. UTC | #9
On 7/10/24 17:24, Bjorn Helgaas wrote:
> On Wed, Jul 10, 2024 at 12:16:24PM -0400, Stewart Hildebrand wrote:
>> On 7/9/24 12:19, Bjorn Helgaas wrote:
>>> On Tue, Jul 09, 2024 at 09:36:01AM -0400, Stewart Hildebrand wrote:
>>>> diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
>>>> index f2f4a5d50b27..ff6e61389ec7 100644
>>>> --- a/arch/x86/pci/i386.c
>>>> +++ b/arch/x86/pci/i386.c
>>>> @@ -283,8 +283,11 @@ static void pcibios_allocate_dev_resources(struct pci_dev *dev, int pass)
>>>>  						/* We'll assign a new address later */
>>>>  						pcibios_save_fw_addr(dev,
>>>>  								idx, r->start);
>>>> -						r->end -= r->start;
>>>> -						r->start = 0;
>>>> +						if (!(r->flags &
>>>> +						      IORESOURCE_STARTALIGN)) {
>>>> +							r->end -= r->start;
>>>> +							r->start = 0;
>>>> +						}
> 
> I wondered why this only touched x86 and whether other arches need a
> similar change.  This is used in two paths:
> 
>   1) pcibios_resource_survey_bus(), which is only implemented by x86
> 
>   2) pcibios_resource_survey(), which is implemented by x86 and
>   powerpc.  The powerpc pcibios_allocate_resources() is similar to the
>   x86 pcibios_allocate_dev_resources(), but powerpc doesn't have the
>   r->end and r->start updates you're making conditional.

Actually it does. It's in a separate function, alloc_resource(). I'll
make the change over there too.

> So it looks like x86 is indeed the only place that needs this change.
> None of this stuff is arch-specific, so it's a shame that we don't
> have generic code for it all.  Sigh, oh well.
> 
>>>>  					}
>>>>  				}
>>>>  			}
>>>> -- 
>>>> 2.45.2
>>>>
>>
Stewart Hildebrand July 15, 2024, 5:30 p.m. UTC | #10
On 7/10/24 10:05, Ilpo Järvinen wrote:
> On Tue, 9 Jul 2024, Stewart Hildebrand wrote:
> 
>> Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
>> x86 due to the alignment being overwritten in
>> pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
>> make it work on x86.
>>
>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>> ---
>> RFC: We don't have enough info in this function to re-calculate the
>>      alignment value in case of IORESOURCE_STARTALIGN. Luckily our
>>      alignment value seems to be intact, so just don't touch it...
>>      Alternatively, we could call pci_reassigndev_resource_alignment()
>>      after the loop. Would that be preferable?
>> ---
>>  arch/x86/pci/i386.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
>> index f2f4a5d50b27..ff6e61389ec7 100644
>> --- a/arch/x86/pci/i386.c
>> +++ b/arch/x86/pci/i386.c
>> @@ -283,8 +283,11 @@ static void pcibios_allocate_dev_resources(struct pci_dev *dev, int pass)
>>  						/* We'll assign a new address later */
>>  						pcibios_save_fw_addr(dev,
>>  								idx, r->start);
>> -						r->end -= r->start;
>> -						r->start = 0;
>> +						if (!(r->flags &
>> +						      IORESOURCE_STARTALIGN)) {
>> +							r->end -= r->start;
>> +							r->start = 0;
>> +						}
>>  					}
>>  				}
>>  			}
>>
> 
> As a general comment to that loop in pcibios_allocate_dev_resources() 
> function, it would be nice to reverse some of the logic in the if 
> conditions and use continue to limit the runaway indentation level.

The similar function pcibios_allocate_resources() in
arch/powerpc/kernel/pci-common.c has moved some of the logic out into a
separate function. I'll do the same here.
diff mbox series

Patch

diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
index f2f4a5d50b27..ff6e61389ec7 100644
--- a/arch/x86/pci/i386.c
+++ b/arch/x86/pci/i386.c
@@ -283,8 +283,11 @@  static void pcibios_allocate_dev_resources(struct pci_dev *dev, int pass)
 						/* We'll assign a new address later */
 						pcibios_save_fw_addr(dev,
 								idx, r->start);
-						r->end -= r->start;
-						r->start = 0;
+						if (!(r->flags &
+						      IORESOURCE_STARTALIGN)) {
+							r->end -= r->start;
+							r->start = 0;
+						}
 					}
 				}
 			}