diff mbox

[v7,4/7] PCI/ACPI: Add interface acpi_pci_root_create()

Message ID 564185B6.8040006@linux.intel.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Jiang Liu Nov. 10, 2015, 5:50 a.m. UTC
On 2015/11/10 4:09, Arnd Bergmann wrote:
> On Monday 09 November 2015 17:10:43 Lorenzo Pieralisi wrote:
>> On Mon, Nov 09, 2015 at 03:07:38PM +0100, Tomasz Nowicki wrote:
>>> On 06.11.2015 14:22, Jiang Liu wrote:
>>>> On 2015/11/6 20:40, Tomasz Nowicki wrote:
>>>>> On 06.11.2015 12:46, Jiang Liu wrote:
>>>>>> On 2015/11/6 18:37, Tomasz Nowicki wrote:
>>>>>>> On 06.11.2015 09:52, Jiang Liu wrote:
>>>>>>> Sure, ARM64 (0-16M IO space) QEMU example:
>>>>>>> DWordIO (ResourceProducer, MinFixed, MaxFixed, PosDecode, EntireRange,
>>>>>>>           0x00000000,         // Granularity
>>>>>>>           0x00000000,         // Range Minimum
>>>>>>>           0x0000FFFF,         // Range Maximum
>>>>>>>           0x3EFF0000,         // Translation Offset
>>>>>>>           0x00010000,         // Length
>>>>>>>           ,, , TypeStatic)
>>>>>> The above DWordIO resource descriptor doesn't confirm to the ACPI spec.
>>>>>> According to my understanding, ARM/ARM64 has no concept of IO port
>>>>>> address space, so the PCI host bridge will map IO port on PCI side
>>>>>> onto MMIO on host side. In other words, PCI host bridge on ARM64
>>>>>> implement a IO Port->MMIO translation instead of a IO Port->IO Port
>>>>>> translation. If that's true, it should use 'TypeTranslation' instead
>>>>>> of 'TypeStatic'. And kernel ACPI resource parsing interface doesn't
>>>>>> support 'TypeTranslation' yet, so we need to find a solution for it.
>>>>>
>>>>> I think you are right, we need TypeTranslation flag for ARM64 DWordIO
>>>>> descriptors and an extra kernel patch to support it.
>>>> How about the attached to patch to support TypeTranslation?
>>>> It only passes compilation:)
>>>
>>> Based on the further discussion, your draft patch looks good to me.
>>> Lorenzo, do you agree?
>>
>> No, because I still do not understand the difference between ia64 and
>> arm64 (they both drive IO ports cycles through MMIO so the resource
>> descriptors content must be the same or better they must mean the same
>> thing). On top of that, this is something that was heavily debated for DT:
>>
>> http://www.spinics.net/lists/arm-kernel/msg345633.html
>>
>> and I would like to get Arnd and Bjorn opinion on this because we
>> should not "interpret" ACPI specifications, we should understand
>> what they are supposed to describe and write kernel code accordingly.
>>
>> In particular, I would like to understand, for an eg DWordIO descriptor,
>> what Range Minimum, Range Maximum and Translation Offset represent,
>> they can't mean different things depending on the SW parsing them,
>> this totally defeats the purpose.
> 
> I have no clue about what those mean in ACPI though.
> 
> Generally speaking, each PCI domain is expected to have a (normally 64KB)
> range of CPU addresses that gets translated into PCI I/O space the same
> way that config space and memory space are handled.
> This is true for almost every architecture except for x86, which uses
> different CPU instructions for I/O space compared to the other spaces.
> 
>> By the way, ia64 ioremaps the translation_offset (ie new_space()), so
>> basically that's the CPU physical address at which the PCI host bridge
>> map the IO space transactions), I do not think ia64 is any different from
>> arm64 in this respect, if it is please provide an HW description here from
>> the PCI bus perspective here (also an example of ia64 ACPI PCI host bridge
>> tables would help).
> 
> The main difference between ia64 and a lot of the other architectures (e.g.
> sparc is different again) is that ia64 defines a logical address range
> in terms of having a small number for each I/O space followed by the
> offset within that space as a 'port number' and uses a mapping function
> that is defined as
> 
> static inline void *__ia64_mk_io_addr (unsigned long port)
> {
>         struct io_space *space = &io_space[IO_SPACE_NR(port)];
>         return (space->mmio_base | IO_SPACE_PORT(port););
> }
> static inline unsigned int inl(unsigned long port)
> {
>         return *__ia64_mk_io_addr(port);
> }
> 
> Most architectures allow only one I/O port range and put it at a fixed
> virtual address so that inl() simply becomes 
> 
> static inline u32 inl(unsigned long addr)
> {
>         return readl(PCI_IOBASE + addr);
> }
> 
> which noticeably reduces code size.
> 
> On some architectures (powerpc, arm, arm64), we then get the same simplified
> definition with a fixed virtual address, and use pci_ioremap_io() or
> something like that to to map a physical address range into this virtual
> address window at the correct io_offset;
Hi all,
	Thanks for explanation, I found a way to make the ACPI resource
parsing interface arch neutral, it should help to address Lorenzo's
concern. Please refer to the attached patch. (It's still RFC, not tested
yet).
Thanks,
Gerry
From 6521d3c5303b1d9830eecede4c1b778deffe17b0 Mon Sep 17 00:00:00 2001
From: Liu Jiang <jiang.liu@linux.intel.com>
Date: Tue, 10 Nov 2015 10:59:11 +0800
Subject: [PATCH] ACPI, PCI: Refine the way to handle translation_offset for
 ACPI resources

Some architectures, such as IA64 and ARM64, have no instructions to
directly access PCI IO ports, so they map PCI IO ports into PCI MMIO
address space. Typically PCI host bridges on those architectures take
the responsibility to map (translate) PCI IO port transactions into
Memory-Mapped IO transactions. ACPI specification provides support
of such a usage case by using resource translation_offset.

But current ACPI resource parsing interface isn't neutral enough,
it still has some special logic for IA64. So refine the ACPI resource
parsing interface and IA64 code to neutrally handle translation_offset
by:
1) ACPI resource parsing interface doesn't do any translation, it just
   save the translation_offset to be used by arch code.
2) Arch code will do the mapping(translation) based on arch specific
   information. Typically it does:
2.a) Translate per PCI domain IO port address space into system global
   IO port address space.
2.b) Setup MMIO address mapping for IO ports.
void handle_io_resource(struct resource_entry *io_entry)
{
	struct resource *mmio_res;

	mmio_res = kzalloc(sizeof(*mmio_res), GFP_KERNEL);
	mmio_res->flags = IORESOURCE_MEM;
	mmio_res->start = io_entry->offset + io_entry->res->start;
	mmio_res->end = io_entry->offset + io_entry->res->end;
	insert_resource(&iomem_resource, mmio_res)

	base = map_to_system_ioport_address(entry);
	io_entry->offset = base;
	io_entry->res->start += base;
	io_entry->res->end += base;
}

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/ia64/pci/pci.c     |   26 ++++++++++++++++----------
 drivers/acpi/resource.c |   12 +++++-------
 2 files changed, 21 insertions(+), 17 deletions(-)

Comments

Lorenzo Pieralisi Nov. 11, 2015, 5:46 p.m. UTC | #1
On Tue, Nov 10, 2015 at 01:50:46PM +0800, Jiang Liu wrote:

[...]

> >> In particular, I would like to understand, for an eg DWordIO descriptor,
> >> what Range Minimum, Range Maximum and Translation Offset represent,
> >> they can't mean different things depending on the SW parsing them,
> >> this totally defeats the purpose.
> > 
> > I have no clue about what those mean in ACPI though.
> > 
> > Generally speaking, each PCI domain is expected to have a (normally 64KB)
> > range of CPU addresses that gets translated into PCI I/O space the same
> > way that config space and memory space are handled.
> > This is true for almost every architecture except for x86, which uses
> > different CPU instructions for I/O space compared to the other spaces.
> > 
> >> By the way, ia64 ioremaps the translation_offset (ie new_space()), so
> >> basically that's the CPU physical address at which the PCI host bridge
> >> map the IO space transactions), I do not think ia64 is any different from
> >> arm64 in this respect, if it is please provide an HW description here from
> >> the PCI bus perspective here (also an example of ia64 ACPI PCI host bridge
> >> tables would help).
> > 
> > The main difference between ia64 and a lot of the other architectures (e.g.
> > sparc is different again) is that ia64 defines a logical address range
> > in terms of having a small number for each I/O space followed by the
> > offset within that space as a 'port number' and uses a mapping function
> > that is defined as
> > 
> > static inline void *__ia64_mk_io_addr (unsigned long port)
> > {
> >         struct io_space *space = &io_space[IO_SPACE_NR(port)];
> >         return (space->mmio_base | IO_SPACE_PORT(port););
> > }
> > static inline unsigned int inl(unsigned long port)
> > {
> >         return *__ia64_mk_io_addr(port);
> > }
> > 
> > Most architectures allow only one I/O port range and put it at a fixed
> > virtual address so that inl() simply becomes 
> > 
> > static inline u32 inl(unsigned long addr)
> > {
> >         return readl(PCI_IOBASE + addr);
> > }
> > 
> > which noticeably reduces code size.
> > 
> > On some architectures (powerpc, arm, arm64), we then get the same simplified
> > definition with a fixed virtual address, and use pci_ioremap_io() or
> > something like that to to map a physical address range into this virtual
> > address window at the correct io_offset;
> Hi all,
> 	Thanks for explanation, I found a way to make the ACPI resource
> parsing interface arch neutral, it should help to address Lorenzo's
> concern. Please refer to the attached patch. (It's still RFC, not tested
> yet).

If we go with this approach though, you are not adding the offset to
the resource when parsing the memory spaces in acpi_decode_space(), are we
sure that's what we really want ?

In DT, a host bridge range has a:

- CPU physical address
- PCI bus address

We use that to compute the offset between primary bus (ie CPU physical
address) and secondary bus (ie PCI bus address).

The value ending up in the PCI resource struct (for memory space) is
the CPU physical address, if you do not add the offset in acpi_decode_space
that does not hold true on platforms where CPU<->PCI offset != 0 on ACPI,
am I wrong ?

Overall I think the point is related to ioport_resource and its check
in acpi_pci_root_validate_resources() which basically that's the
problem that started this thread.

On arm64, IO_SPACE_LIMIT is 16M, which, AFAIK is a kernel limit, not
a HW one. Comparing the resources parsed from the PCI bridge _CRS against
the range 0..IO_SPACE_LIMIT is not necessarily meaningful (or at least
not meaningful in its current form), on ia64 it works because IO_SPACE_LIMIT
is bumped up to 4G, that's the reason why adding the offset to the ACPI IO
resources work on ia64 as far as I understand.

And that's why I pulled Arnd in this discussion since he knows better
than me: what does ioport_resource _really_ represent on ARM64 ? It seems
to me that it is a range of IO ports values (ie a window that defines
the allowed offset in the virtual address space allocated to PCI IO) that
has _nothing_ to do with the CPU physical address at which the IO space is
actually mapped.

To sum it up for a, say, DWordIo/Memory descriptor:

- AddressMinimum, AddressMaximum represent the PCI bus addresses defining
  the resource start..end
- AddressTranslation is the offset that has to be added to AddressMinimum
  and AddressMaximum to get the window in CPU physical address space

So:

- Either we go with the patch attached (but please check my comment on
  the memory spaces)
- Or we patch acpi_pci_root_validate_resources() to amend the way
  IORESOURCE_IO is currently checked against ioport_resource, it can't
  work on arm64 at present, I described why above

Thoughts appreciated it is time we got this sorted and thanks for
the patch.

Thanks,
Lorenzo
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
liviu.dudau@arm.com Nov. 11, 2015, 6:12 p.m. UTC | #2
On Wed, Nov 11, 2015 at 05:46:47PM +0000, Lorenzo Pieralisi wrote:
> On Tue, Nov 10, 2015 at 01:50:46PM +0800, Jiang Liu wrote:
> 
> [...]
> 
> > >> In particular, I would like to understand, for an eg DWordIO descriptor,
> > >> what Range Minimum, Range Maximum and Translation Offset represent,
> > >> they can't mean different things depending on the SW parsing them,
> > >> this totally defeats the purpose.
> > > 
> > > I have no clue about what those mean in ACPI though.
> > > 
> > > Generally speaking, each PCI domain is expected to have a (normally 64KB)
> > > range of CPU addresses that gets translated into PCI I/O space the same
> > > way that config space and memory space are handled.
> > > This is true for almost every architecture except for x86, which uses
> > > different CPU instructions for I/O space compared to the other spaces.
> > > 
> > >> By the way, ia64 ioremaps the translation_offset (ie new_space()), so
> > >> basically that's the CPU physical address at which the PCI host bridge
> > >> map the IO space transactions), I do not think ia64 is any different from
> > >> arm64 in this respect, if it is please provide an HW description here from
> > >> the PCI bus perspective here (also an example of ia64 ACPI PCI host bridge
> > >> tables would help).
> > > 
> > > The main difference between ia64 and a lot of the other architectures (e.g.
> > > sparc is different again) is that ia64 defines a logical address range
> > > in terms of having a small number for each I/O space followed by the
> > > offset within that space as a 'port number' and uses a mapping function
> > > that is defined as
> > > 
> > > static inline void *__ia64_mk_io_addr (unsigned long port)
> > > {
> > >         struct io_space *space = &io_space[IO_SPACE_NR(port)];
> > >         return (space->mmio_base | IO_SPACE_PORT(port););
> > > }
> > > static inline unsigned int inl(unsigned long port)
> > > {
> > >         return *__ia64_mk_io_addr(port);
> > > }
> > > 
> > > Most architectures allow only one I/O port range and put it at a fixed
> > > virtual address so that inl() simply becomes 
> > > 
> > > static inline u32 inl(unsigned long addr)
> > > {
> > >         return readl(PCI_IOBASE + addr);
> > > }
> > > 
> > > which noticeably reduces code size.
> > > 
> > > On some architectures (powerpc, arm, arm64), we then get the same simplified
> > > definition with a fixed virtual address, and use pci_ioremap_io() or
> > > something like that to to map a physical address range into this virtual
> > > address window at the correct io_offset;
> > Hi all,
> > 	Thanks for explanation, I found a way to make the ACPI resource
> > parsing interface arch neutral, it should help to address Lorenzo's
> > concern. Please refer to the attached patch. (It's still RFC, not tested
> > yet).
> 
> If we go with this approach though, you are not adding the offset to
> the resource when parsing the memory spaces in acpi_decode_space(), are we
> sure that's what we really want ?
> 
> In DT, a host bridge range has a:
> 
> - CPU physical address
> - PCI bus address
> 
> We use that to compute the offset between primary bus (ie CPU physical
> address) and secondary bus (ie PCI bus address).
> 
> The value ending up in the PCI resource struct (for memory space) is
> the CPU physical address, if you do not add the offset in acpi_decode_space
> that does not hold true on platforms where CPU<->PCI offset != 0 on ACPI,
> am I wrong ?
> 
> Overall I think the point is related to ioport_resource and its check
> in acpi_pci_root_validate_resources() which basically that's the
> problem that started this thread.
> 
> On arm64, IO_SPACE_LIMIT is 16M, which, AFAIK is a kernel limit, not
> a HW one. 

You're right, it is a kernel limit. There is no HW limit for IO on arm64.

> Comparing the resources parsed from the PCI bridge _CRS against
> the range 0..IO_SPACE_LIMIT is not necessarily meaningful (or at least
> not meaningful in its current form), on ia64 it works because IO_SPACE_LIMIT
> is bumped up to 4G, that's the reason why adding the offset to the ACPI IO
> resources work on ia64 as far as I understand.
> 
> And that's why I pulled Arnd in this discussion since he knows better
> than me: what does ioport_resource _really_ represent on ARM64 ? It seems
> to me that it is a range of IO ports values (ie a window that defines
> the allowed offset in the virtual address space allocated to PCI IO) that
> has _nothing_ to do with the CPU physical address at which the IO space is
> actually mapped.

Correct. The idea is that you can have any number of host bridges and what
you get back for an ioport_resource is a window inside the host bridge IO
space. That space is also offset-ed inside the kernel's IO port space
(0..IO_SPACE_LIMIT) by the amount of space already taken by preceeding
host bridges, so that two ioport_resources comming from two different
host bridges don't overlap in the CPU address space.

Best regards,
Liviu


> 
> To sum it up for a, say, DWordIo/Memory descriptor:
> 
> - AddressMinimum, AddressMaximum represent the PCI bus addresses defining
>   the resource start..end
> - AddressTranslation is the offset that has to be added to AddressMinimum
>   and AddressMaximum to get the window in CPU physical address space
> 
> So:
> 
> - Either we go with the patch attached (but please check my comment on
>   the memory spaces)
> - Or we patch acpi_pci_root_validate_resources() to amend the way
>   IORESOURCE_IO is currently checked against ioport_resource, it can't
>   work on arm64 at present, I described why above
> 
> Thoughts appreciated it is time we got this sorted and thanks for
> the patch.
> 
> Thanks,
> Lorenzo
>
Arnd Bergmann Nov. 11, 2015, 8:55 p.m. UTC | #3
On Wednesday 11 November 2015 17:46:47 Lorenzo Pieralisi wrote:
> On Tue, Nov 10, 2015 at 01:50:46PM +0800, Jiang Liu wrote:
> If we go with this approach though, you are not adding the offset to
> the resource when parsing the memory spaces in acpi_decode_space(), are we
> sure that's what we really want ?
> 
> In DT, a host bridge range has a:
> 
> - CPU physical address
> - PCI bus address
> 
> We use that to compute the offset between primary bus (ie CPU physical
> address) and secondary bus (ie PCI bus address).
> 
> The value ending up in the PCI resource struct (for memory space) is
> the CPU physical address, if you do not add the offset in acpi_decode_space
> that does not hold true on platforms where CPU<->PCI offset != 0 on ACPI,
> am I wrong ?

Sinan Kaya pointed out that SBSA mandates a 1:1 mapping for memory space.

> On arm64, IO_SPACE_LIMIT is 16M, which, AFAIK is a kernel limit, not
> a HW one. Comparing the resources parsed from the PCI bridge _CRS against
> the range 0..IO_SPACE_LIMIT is not necessarily meaningful (or at least
> not meaningful in its current form), on ia64 it works because IO_SPACE_LIMIT
> is bumped up to 4G, that's the reason why adding the offset to the ACPI IO
> resources work on ia64 as far as I understand.
> 
> And that's why I pulled Arnd in this discussion since he knows better
> than me: what does ioport_resource _really_ represent on ARM64 ? It seems
> to me that it is a range of IO ports values (ie a window that defines
> the allowed offset in the virtual address space allocated to PCI IO) that
> has _nothing_ to do with the CPU physical address at which the IO space is
> actually mapped.

Right, the ioport_resource uses the same space as the CPU virtual address,
with an offset of PCI_IOBASE that is defined as

#define PCI_IOBASE            ((void __iomem *)PCI_IO_VIRT_BASE)
#define PCI_IO_START            (PCI_IO_END - PCI_IO_SIZE)
#define PCI_IO_END              (MODULES_VADDR - SZ_2M)
#define PCI_IO_SIZE             SZ_16M

> To sum it up for a, say, DWordIo/Memory descriptor:
> 
> - AddressMinimum, AddressMaximum represent the PCI bus addresses defining
>   the resource start..end
> - AddressTranslation is the offset that has to be added to AddressMinimum
>   and AddressMaximum to get the window in CPU physical address space
> 
> So:
> 
> - Either we go with the patch attached (but please check my comment on
>   the memory spaces)
> - Or we patch acpi_pci_root_validate_resources() to amend the way
>   IORESOURCE_IO is currently checked against ioport_resource, it can't
>   work on arm64 at present, I described why above
> 
> Thoughts appreciated it is time we got this sorted and thanks for
> the patch.

The easiest way would be to assume that nobody is building a server
system that has multiple I/O spaces. SBSA explicitly makes I/O space
optional, and really the only reason anyone includes that feature these
days is for initializing GPUs through the BIOS POST method, so that
is not relevant for servers. Even when you do have multiple PCIe
host controllers that all support I/O space, the most logical implementation
would be to share one common space.

If that fails, there are still two cases you have to care about, and
it really depends on what hardware makers decide to use here (and whether
we have any influence over them). The easier way for us to do this is
if every hardware sets up the mapping between CPU physical and PCI I/O
in a way that the I/O space numbers are non-overlapping between the
host controllers. That way we can still make Linux ioport_resource
addresses the same as PCI I/O space addresses (using io_offset=0),
with the downside being that only the first PCIe host (as enumerated
by the bootloader) can have I/O space addresses below 1024 that may
be required for ISA compatibility on some hardware.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jiang Liu Nov. 12, 2015, 8:43 a.m. UTC | #4
On 2015/11/12 1:46, Lorenzo Pieralisi wrote:
> On Tue, Nov 10, 2015 at 01:50:46PM +0800, Jiang Liu wrote:
> 
> [...]
> 
>>>> In particular, I would like to understand, for an eg DWordIO descriptor,
>>>> what Range Minimum, Range Maximum and Translation Offset represent,
>>>> they can't mean different things depending on the SW parsing them,
>>>> this totally defeats the purpose.
>>>
>>> I have no clue about what those mean in ACPI though.
>>>
>>> Generally speaking, each PCI domain is expected to have a (normally 64KB)
>>> range of CPU addresses that gets translated into PCI I/O space the same
>>> way that config space and memory space are handled.
>>> This is true for almost every architecture except for x86, which uses
>>> different CPU instructions for I/O space compared to the other spaces.
>>>
>>>> By the way, ia64 ioremaps the translation_offset (ie new_space()), so
>>>> basically that's the CPU physical address at which the PCI host bridge
>>>> map the IO space transactions), I do not think ia64 is any different from
>>>> arm64 in this respect, if it is please provide an HW description here from
>>>> the PCI bus perspective here (also an example of ia64 ACPI PCI host bridge
>>>> tables would help).
>>>
>>> The main difference between ia64 and a lot of the other architectures (e.g.
>>> sparc is different again) is that ia64 defines a logical address range
>>> in terms of having a small number for each I/O space followed by the
>>> offset within that space as a 'port number' and uses a mapping function
>>> that is defined as
>>>
>>> static inline void *__ia64_mk_io_addr (unsigned long port)
>>> {
>>>         struct io_space *space = &io_space[IO_SPACE_NR(port)];
>>>         return (space->mmio_base | IO_SPACE_PORT(port););
>>> }
>>> static inline unsigned int inl(unsigned long port)
>>> {
>>>         return *__ia64_mk_io_addr(port);
>>> }
>>>
>>> Most architectures allow only one I/O port range and put it at a fixed
>>> virtual address so that inl() simply becomes 
>>>
>>> static inline u32 inl(unsigned long addr)
>>> {
>>>         return readl(PCI_IOBASE + addr);
>>> }
>>>
>>> which noticeably reduces code size.
>>>
>>> On some architectures (powerpc, arm, arm64), we then get the same simplified
>>> definition with a fixed virtual address, and use pci_ioremap_io() or
>>> something like that to to map a physical address range into this virtual
>>> address window at the correct io_offset;
>> Hi all,
>> 	Thanks for explanation, I found a way to make the ACPI resource
>> parsing interface arch neutral, it should help to address Lorenzo's
>> concern. Please refer to the attached patch. (It's still RFC, not tested
>> yet).
> 
> If we go with this approach though, you are not adding the offset to
> the resource when parsing the memory spaces in acpi_decode_space(), are we
> sure that's what we really want ?
> 
> In DT, a host bridge range has a:
> 
> - CPU physical address
> - PCI bus address
> 
> We use that to compute the offset between primary bus (ie CPU physical
> address) and secondary bus (ie PCI bus address).
> 
> The value ending up in the PCI resource struct (for memory space) is
> the CPU physical address, if you do not add the offset in acpi_decode_space
> that does not hold true on platforms where CPU<->PCI offset != 0 on ACPI,
> am I wrong ?
Hi Lorenzo,
	I may have found the divergence between us about the design here. You
treat it as a one-stage translation but I treat it as a
two-stage translation as below:
stage 1: map(translate) per-PCI-domain IO port address[0, 16M) into
system global IO port address. Here system global IO port address is
ioport_resource[0, IO_SPACE_LIMIT).
stage 2: map system IO port address into system memory address.

We need two objects of struct resource_win to support above two-stage
translation. One object, type of IORESOURCE_IO, is used to support
stage one, and it will also used to allocate IO port resources
for PCI devices. Another object, type of IORESOURCE_MMIO, is used
to allocate resource from iomem_resource and setup MMIO mapping
to actually access IO ports.

For ARM64, it doesn't support multiple per-PCI-domain(bus local)
IO port address space yet, so stage one seems to be optional
becomes the offset between bus local IO port address and system
IO port address is always 0. But we still need two objects of
struct resource_win. The first object is
	{
		offset:0,
		start:AddressMinimum,
		end:AddressMaximum,
		flags:IORESOURCE_IO
	}
Here it's type of IORESOURCE_IO and offset must be zero because
pcibios_resource_to_bus() will access it translate system IO
port address into bus local IO port address. With my patch,
the struct resource_win object created by the ACPI core will
be reused for this.

The second object is:
	{
		offset:Translation_Offset,
		start:AddressMinimum + Translation_Offset,
		end:AddressMaximum + Translation_Offset,
		flags:IORESOURCE_MMIO
	}
Arch code need to create the second struct resource_win object
and actually setup the MMIO mapping.

But there's really another bug need to get fixed, funciton
acpi_dev_ioresource_flags() assumes bus local IO port address
space is size of 64K, which is wrong for IA64 and ARM64.

Thanks,
Gerry

> 
> Overall I think the point is related to ioport_resource and its check
> in acpi_pci_root_validate_resources() which basically that's the
> problem that started this thread.
> 
> On arm64, IO_SPACE_LIMIT is 16M, which, AFAIK is a kernel limit, not
> a HW one. Comparing the resources parsed from the PCI bridge _CRS against
> the range 0..IO_SPACE_LIMIT is not necessarily meaningful (or at least
> not meaningful in its current form), on ia64 it works because IO_SPACE_LIMIT
> is bumped up to 4G, that's the reason why adding the offset to the ACPI IO
> resources work on ia64 as far as I understand.
> 
> And that's why I pulled Arnd in this discussion since he knows better
> than me: what does ioport_resource _really_ represent on ARM64 ? It seems
> to me that it is a range of IO ports values (ie a window that defines
> the allowed offset in the virtual address space allocated to PCI IO) that
> has _nothing_ to do with the CPU physical address at which the IO space is
> actually mapped.
> 
> To sum it up for a, say, DWordIo/Memory descriptor:
> 
> - AddressMinimum, AddressMaximum represent the PCI bus addresses defining
>   the resource start..end
> - AddressTranslation is the offset that has to be added to AddressMinimum
>   and AddressMaximum to get the window in CPU physical address space
> 
> So:
> 
> - Either we go with the patch attached (but please check my comment on
>   the memory spaces)
> - Or we patch acpi_pci_root_validate_resources() to amend the way
>   IORESOURCE_IO is currently checked against ioport_resource, it can't
>   work on arm64 at present, I described why above
> 
> Thoughts appreciated it is time we got this sorted and thanks for
> the patch.
> 
> Thanks,
> Lorenzo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lorenzo Pieralisi Nov. 12, 2015, 12:08 p.m. UTC | #5
On Wed, Nov 11, 2015 at 09:55:37PM +0100, Arnd Bergmann wrote:
> On Wednesday 11 November 2015 17:46:47 Lorenzo Pieralisi wrote:
> > On Tue, Nov 10, 2015 at 01:50:46PM +0800, Jiang Liu wrote:
> > If we go with this approach though, you are not adding the offset to
> > the resource when parsing the memory spaces in acpi_decode_space(), are we
> > sure that's what we really want ?
> > 
> > In DT, a host bridge range has a:
> > 
> > - CPU physical address
> > - PCI bus address
> > 
> > We use that to compute the offset between primary bus (ie CPU physical
> > address) and secondary bus (ie PCI bus address).
> > 
> > The value ending up in the PCI resource struct (for memory space) is
> > the CPU physical address, if you do not add the offset in acpi_decode_space
> > that does not hold true on platforms where CPU<->PCI offset != 0 on ACPI,
> > am I wrong ?
> 
> Sinan Kaya pointed out that SBSA mandates a 1:1 mapping for memory space.

acpi_decode_space() is generic code though, it has to work on all arches,
I was worried about triggering regressions on x86 and ia64 here.

[...]

> > To sum it up for a, say, DWordIo/Memory descriptor:
> > 
> > - AddressMinimum, AddressMaximum represent the PCI bus addresses defining
> >   the resource start..end
> > - AddressTranslation is the offset that has to be added to AddressMinimum
> >   and AddressMaximum to get the window in CPU physical address space
> > 
> > So:
> > 
> > - Either we go with the patch attached (but please check my comment on
> >   the memory spaces)
> > - Or we patch acpi_pci_root_validate_resources() to amend the way
> >   IORESOURCE_IO is currently checked against ioport_resource, it can't
> >   work on arm64 at present, I described why above
> > 
> > Thoughts appreciated it is time we got this sorted and thanks for
> > the patch.
> 
> The easiest way would be to assume that nobody is building a server
> system that has multiple I/O spaces. SBSA explicitly makes I/O space
> optional, and really the only reason anyone includes that feature these
> days is for initializing GPUs through the BIOS POST method, so that
> is not relevant for servers. Even when you do have multiple PCIe
> host controllers that all support I/O space, the most logical implementation
> would be to share one common space.
> 
> If that fails, there are still two cases you have to care about, and
> it really depends on what hardware makers decide to use here (and whether
> we have any influence over them). The easier way for us to do this is
> if every hardware sets up the mapping between CPU physical and PCI I/O
> in a way that the I/O space numbers are non-overlapping between the
> host controllers. That way we can still make Linux ioport_resource
> addresses the same as PCI I/O space addresses (using io_offset=0),
> with the downside being that only the first PCIe host (as enumerated
> by the bootloader) can have I/O space addresses below 1024 that may
> be required for ISA compatibility on some hardware.

Yes, I think the approach we have in place on arm64 sound, I will work with
Jiang to make sure we interpret and manage IO space on arm64 the same
way we do with DT, I do not expect any issue on that side, I was more
worried about the interpretation of ACPI tables, I really really do not
want to end up with ACPI tables that are set-up in a way that can cause
regressions, we have to agree (and make it crystal clear in the ACPI
specs) what the resource descriptors define and how, then the ACPI kernel
resource layer should be made compliant.

Thanks,
Lorenzo
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tomasz Nowicki Nov. 12, 2015, 1:21 p.m. UTC | #6
On 12.11.2015 09:43, Jiang Liu wrote:
> On 2015/11/12 1:46, Lorenzo Pieralisi wrote:
>> On Tue, Nov 10, 2015 at 01:50:46PM +0800, Jiang Liu wrote:
>>
>> [...]
>>
>>>>> In particular, I would like to understand, for an eg DWordIO descriptor,
>>>>> what Range Minimum, Range Maximum and Translation Offset represent,
>>>>> they can't mean different things depending on the SW parsing them,
>>>>> this totally defeats the purpose.
>>>>
>>>> I have no clue about what those mean in ACPI though.
>>>>
>>>> Generally speaking, each PCI domain is expected to have a (normally 64KB)
>>>> range of CPU addresses that gets translated into PCI I/O space the same
>>>> way that config space and memory space are handled.
>>>> This is true for almost every architecture except for x86, which uses
>>>> different CPU instructions for I/O space compared to the other spaces.
>>>>
>>>>> By the way, ia64 ioremaps the translation_offset (ie new_space()), so
>>>>> basically that's the CPU physical address at which the PCI host bridge
>>>>> map the IO space transactions), I do not think ia64 is any different from
>>>>> arm64 in this respect, if it is please provide an HW description here from
>>>>> the PCI bus perspective here (also an example of ia64 ACPI PCI host bridge
>>>>> tables would help).
>>>>
>>>> The main difference between ia64 and a lot of the other architectures (e.g.
>>>> sparc is different again) is that ia64 defines a logical address range
>>>> in terms of having a small number for each I/O space followed by the
>>>> offset within that space as a 'port number' and uses a mapping function
>>>> that is defined as
>>>>
>>>> static inline void *__ia64_mk_io_addr (unsigned long port)
>>>> {
>>>>          struct io_space *space = &io_space[IO_SPACE_NR(port)];
>>>>          return (space->mmio_base | IO_SPACE_PORT(port););
>>>> }
>>>> static inline unsigned int inl(unsigned long port)
>>>> {
>>>>          return *__ia64_mk_io_addr(port);
>>>> }
>>>>
>>>> Most architectures allow only one I/O port range and put it at a fixed
>>>> virtual address so that inl() simply becomes
>>>>
>>>> static inline u32 inl(unsigned long addr)
>>>> {
>>>>          return readl(PCI_IOBASE + addr);
>>>> }
>>>>
>>>> which noticeably reduces code size.
>>>>
>>>> On some architectures (powerpc, arm, arm64), we then get the same simplified
>>>> definition with a fixed virtual address, and use pci_ioremap_io() or
>>>> something like that to to map a physical address range into this virtual
>>>> address window at the correct io_offset;
>>> Hi all,
>>> 	Thanks for explanation, I found a way to make the ACPI resource
>>> parsing interface arch neutral, it should help to address Lorenzo's
>>> concern. Please refer to the attached patch. (It's still RFC, not tested
>>> yet).
>>
>> If we go with this approach though, you are not adding the offset to
>> the resource when parsing the memory spaces in acpi_decode_space(), are we
>> sure that's what we really want ?
>>
>> In DT, a host bridge range has a:
>>
>> - CPU physical address
>> - PCI bus address
>>
>> We use that to compute the offset between primary bus (ie CPU physical
>> address) and secondary bus (ie PCI bus address).
>>
>> The value ending up in the PCI resource struct (for memory space) is
>> the CPU physical address, if you do not add the offset in acpi_decode_space
>> that does not hold true on platforms where CPU<->PCI offset != 0 on ACPI,
>> am I wrong ?
> Hi Lorenzo,
> 	I may have found the divergence between us about the design here. You
> treat it as a one-stage translation but I treat it as a
> two-stage translation as below:
> stage 1: map(translate) per-PCI-domain IO port address[0, 16M) into
> system global IO port address. Here system global IO port address is
> ioport_resource[0, IO_SPACE_LIMIT).
> stage 2: map system IO port address into system memory address.
>
> We need two objects of struct resource_win to support above two-stage
> translation. One object, type of IORESOURCE_IO, is used to support
> stage one, and it will also used to allocate IO port resources
> for PCI devices. Another object, type of IORESOURCE_MMIO, is used
> to allocate resource from iomem_resource and setup MMIO mapping
> to actually access IO ports.
>
> For ARM64, it doesn't support multiple per-PCI-domain(bus local)
> IO port address space yet, so stage one seems to be optional
> becomes the offset between bus local IO port address and system
> IO port address is always 0. But we still need two objects of
> struct resource_win. The first object is
> 	{
> 		offset:0,
> 		start:AddressMinimum,
> 		end:AddressMaximum,
> 		flags:IORESOURCE_IO
> 	}
> Here it's type of IORESOURCE_IO and offset must be zero because
> pcibios_resource_to_bus() will access it translate system IO
> port address into bus local IO port address. With my patch,
> the struct resource_win object created by the ACPI core will
> be reused for this.
>
> The second object is:
> 	{
> 		offset:Translation_Offset,
> 		start:AddressMinimum + Translation_Offset,
> 		end:AddressMaximum + Translation_Offset,
> 		flags:IORESOURCE_MMIO
> 	}
> Arch code need to create the second struct resource_win object
> and actually setup the MMIO mapping.
>
> But there's really another bug need to get fixed, funciton
> acpi_dev_ioresource_flags() assumes bus local IO port address
> space is size of 64K, which is wrong for IA64 and ARM64.
>

So what would be the Translation_Offset meaning for two cases DWordIo 
(....,TypeTranslation) vs DWordIo (....,TypeStatic)? And why we did not 
use TypeTranslation for IA64 so far?

I am worried that TypeTranslation fall into the IA64 category but ACPI 
tables were already written incorrectly.

Tomasz

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jiang Liu Nov. 12, 2015, 2:04 p.m. UTC | #7
On 2015/11/12 21:21, Tomasz Nowicki wrote:
> On 12.11.2015 09:43, Jiang Liu wrote:
>> On 2015/11/12 1:46, Lorenzo Pieralisi wrote:
>>> On Tue, Nov 10, 2015 at 01:50:46PM +0800, Jiang Liu wrote:
>>>
>>> [...]
>>>
>>>>>> In particular, I would like to understand, for an eg DWordIO
>>>>>> descriptor,
>>>>>> what Range Minimum, Range Maximum and Translation Offset represent,
>>>>>> they can't mean different things depending on the SW parsing them,
>>>>>> this totally defeats the purpose.
>>>>>
>>>>> I have no clue about what those mean in ACPI though.
>>>>>
>>>>> Generally speaking, each PCI domain is expected to have a (normally
>>>>> 64KB)
>>>>> range of CPU addresses that gets translated into PCI I/O space the
>>>>> same
>>>>> way that config space and memory space are handled.
>>>>> This is true for almost every architecture except for x86, which uses
>>>>> different CPU instructions for I/O space compared to the other spaces.
>>>>>
>>>>>> By the way, ia64 ioremaps the translation_offset (ie new_space()), so
>>>>>> basically that's the CPU physical address at which the PCI host
>>>>>> bridge
>>>>>> map the IO space transactions), I do not think ia64 is any
>>>>>> different from
>>>>>> arm64 in this respect, if it is please provide an HW description
>>>>>> here from
>>>>>> the PCI bus perspective here (also an example of ia64 ACPI PCI
>>>>>> host bridge
>>>>>> tables would help).
>>>>>
>>>>> The main difference between ia64 and a lot of the other
>>>>> architectures (e.g.
>>>>> sparc is different again) is that ia64 defines a logical address range
>>>>> in terms of having a small number for each I/O space followed by the
>>>>> offset within that space as a 'port number' and uses a mapping
>>>>> function
>>>>> that is defined as
>>>>>
>>>>> static inline void *__ia64_mk_io_addr (unsigned long port)
>>>>> {
>>>>>          struct io_space *space = &io_space[IO_SPACE_NR(port)];
>>>>>          return (space->mmio_base | IO_SPACE_PORT(port););
>>>>> }
>>>>> static inline unsigned int inl(unsigned long port)
>>>>> {
>>>>>          return *__ia64_mk_io_addr(port);
>>>>> }
>>>>>
>>>>> Most architectures allow only one I/O port range and put it at a fixed
>>>>> virtual address so that inl() simply becomes
>>>>>
>>>>> static inline u32 inl(unsigned long addr)
>>>>> {
>>>>>          return readl(PCI_IOBASE + addr);
>>>>> }
>>>>>
>>>>> which noticeably reduces code size.
>>>>>
>>>>> On some architectures (powerpc, arm, arm64), we then get the same
>>>>> simplified
>>>>> definition with a fixed virtual address, and use pci_ioremap_io() or
>>>>> something like that to to map a physical address range into this
>>>>> virtual
>>>>> address window at the correct io_offset;
>>>> Hi all,
>>>>     Thanks for explanation, I found a way to make the ACPI resource
>>>> parsing interface arch neutral, it should help to address Lorenzo's
>>>> concern. Please refer to the attached patch. (It's still RFC, not
>>>> tested
>>>> yet).
>>>
>>> If we go with this approach though, you are not adding the offset to
>>> the resource when parsing the memory spaces in acpi_decode_space(),
>>> are we
>>> sure that's what we really want ?
>>>
>>> In DT, a host bridge range has a:
>>>
>>> - CPU physical address
>>> - PCI bus address
>>>
>>> We use that to compute the offset between primary bus (ie CPU physical
>>> address) and secondary bus (ie PCI bus address).
>>>
>>> The value ending up in the PCI resource struct (for memory space) is
>>> the CPU physical address, if you do not add the offset in
>>> acpi_decode_space
>>> that does not hold true on platforms where CPU<->PCI offset != 0 on
>>> ACPI,
>>> am I wrong ?
>> Hi Lorenzo,
>>     I may have found the divergence between us about the design here. You
>> treat it as a one-stage translation but I treat it as a
>> two-stage translation as below:
>> stage 1: map(translate) per-PCI-domain IO port address[0, 16M) into
>> system global IO port address. Here system global IO port address is
>> ioport_resource[0, IO_SPACE_LIMIT).
>> stage 2: map system IO port address into system memory address.
>>
>> We need two objects of struct resource_win to support above two-stage
>> translation. One object, type of IORESOURCE_IO, is used to support
>> stage one, and it will also used to allocate IO port resources
>> for PCI devices. Another object, type of IORESOURCE_MMIO, is used
>> to allocate resource from iomem_resource and setup MMIO mapping
>> to actually access IO ports.
>>
>> For ARM64, it doesn't support multiple per-PCI-domain(bus local)
>> IO port address space yet, so stage one seems to be optional
>> becomes the offset between bus local IO port address and system
>> IO port address is always 0. But we still need two objects of
>> struct resource_win. The first object is
>>     {
>>         offset:0,
>>         start:AddressMinimum,
>>         end:AddressMaximum,
>>         flags:IORESOURCE_IO
>>     }
>> Here it's type of IORESOURCE_IO and offset must be zero because
>> pcibios_resource_to_bus() will access it translate system IO
>> port address into bus local IO port address. With my patch,
>> the struct resource_win object created by the ACPI core will
>> be reused for this.
>>
>> The second object is:
>>     {
>>         offset:Translation_Offset,
>>         start:AddressMinimum + Translation_Offset,
>>         end:AddressMaximum + Translation_Offset,
>>         flags:IORESOURCE_MMIO
>>     }
>> Arch code need to create the second struct resource_win object
>> and actually setup the MMIO mapping.
>>
>> But there's really another bug need to get fixed, funciton
>> acpi_dev_ioresource_flags() assumes bus local IO port address
>> space is size of 64K, which is wrong for IA64 and ARM64.
>>
> 
> So what would be the Translation_Offset meaning for two cases DWordIo
> (....,TypeTranslation) vs DWordIo (....,TypeStatic)? And why we did not
> use TypeTranslation for IA64 so far?

IA64 actually ignores the translation type flag and just assume it's
TypeTranslation, so there may be some IA64 BIOS implementations
accidentally using TypeStatic. That's why we parsing SparseTranslation
flag without checking TranslationType flag. I feel ARM64 may face the
same situation as IA64:(

We may expect (TypeStatic, 0-offset) and (TypeTranslation,
non-0-offset) in real word. For other two combinations, I haven't
found a real usage yet, though theoretically they are possible.

Thanks,
Gerry

> 
> I am worried that TypeTranslation fall into the IA64 category but ACPI
> tables were already written incorrectly.
> 
> Tomasz
> 
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tomasz Nowicki Nov. 12, 2015, 2:45 p.m. UTC | #8
On 12.11.2015 15:04, Jiang Liu wrote:
> On 2015/11/12 21:21, Tomasz Nowicki wrote:
>> On 12.11.2015 09:43, Jiang Liu wrote:
>>> On 2015/11/12 1:46, Lorenzo Pieralisi wrote:
>>>> On Tue, Nov 10, 2015 at 01:50:46PM +0800, Jiang Liu wrote:
>>>>
>>>> [...]
>>>>
>>>>>>> In particular, I would like to understand, for an eg DWordIO
>>>>>>> descriptor,
>>>>>>> what Range Minimum, Range Maximum and Translation Offset represent,
>>>>>>> they can't mean different things depending on the SW parsing them,
>>>>>>> this totally defeats the purpose.
>>>>>>
>>>>>> I have no clue about what those mean in ACPI though.
>>>>>>
>>>>>> Generally speaking, each PCI domain is expected to have a (normally
>>>>>> 64KB)
>>>>>> range of CPU addresses that gets translated into PCI I/O space the
>>>>>> same
>>>>>> way that config space and memory space are handled.
>>>>>> This is true for almost every architecture except for x86, which uses
>>>>>> different CPU instructions for I/O space compared to the other spaces.
>>>>>>
>>>>>>> By the way, ia64 ioremaps the translation_offset (ie new_space()), so
>>>>>>> basically that's the CPU physical address at which the PCI host
>>>>>>> bridge
>>>>>>> map the IO space transactions), I do not think ia64 is any
>>>>>>> different from
>>>>>>> arm64 in this respect, if it is please provide an HW description
>>>>>>> here from
>>>>>>> the PCI bus perspective here (also an example of ia64 ACPI PCI
>>>>>>> host bridge
>>>>>>> tables would help).
>>>>>>
>>>>>> The main difference between ia64 and a lot of the other
>>>>>> architectures (e.g.
>>>>>> sparc is different again) is that ia64 defines a logical address range
>>>>>> in terms of having a small number for each I/O space followed by the
>>>>>> offset within that space as a 'port number' and uses a mapping
>>>>>> function
>>>>>> that is defined as
>>>>>>
>>>>>> static inline void *__ia64_mk_io_addr (unsigned long port)
>>>>>> {
>>>>>>           struct io_space *space = &io_space[IO_SPACE_NR(port)];
>>>>>>           return (space->mmio_base | IO_SPACE_PORT(port););
>>>>>> }
>>>>>> static inline unsigned int inl(unsigned long port)
>>>>>> {
>>>>>>           return *__ia64_mk_io_addr(port);
>>>>>> }
>>>>>>
>>>>>> Most architectures allow only one I/O port range and put it at a fixed
>>>>>> virtual address so that inl() simply becomes
>>>>>>
>>>>>> static inline u32 inl(unsigned long addr)
>>>>>> {
>>>>>>           return readl(PCI_IOBASE + addr);
>>>>>> }
>>>>>>
>>>>>> which noticeably reduces code size.
>>>>>>
>>>>>> On some architectures (powerpc, arm, arm64), we then get the same
>>>>>> simplified
>>>>>> definition with a fixed virtual address, and use pci_ioremap_io() or
>>>>>> something like that to to map a physical address range into this
>>>>>> virtual
>>>>>> address window at the correct io_offset;
>>>>> Hi all,
>>>>>      Thanks for explanation, I found a way to make the ACPI resource
>>>>> parsing interface arch neutral, it should help to address Lorenzo's
>>>>> concern. Please refer to the attached patch. (It's still RFC, not
>>>>> tested
>>>>> yet).
>>>>
>>>> If we go with this approach though, you are not adding the offset to
>>>> the resource when parsing the memory spaces in acpi_decode_space(),
>>>> are we
>>>> sure that's what we really want ?
>>>>
>>>> In DT, a host bridge range has a:
>>>>
>>>> - CPU physical address
>>>> - PCI bus address
>>>>
>>>> We use that to compute the offset between primary bus (ie CPU physical
>>>> address) and secondary bus (ie PCI bus address).
>>>>
>>>> The value ending up in the PCI resource struct (for memory space) is
>>>> the CPU physical address, if you do not add the offset in
>>>> acpi_decode_space
>>>> that does not hold true on platforms where CPU<->PCI offset != 0 on
>>>> ACPI,
>>>> am I wrong ?
>>> Hi Lorenzo,
>>>      I may have found the divergence between us about the design here. You
>>> treat it as a one-stage translation but I treat it as a
>>> two-stage translation as below:
>>> stage 1: map(translate) per-PCI-domain IO port address[0, 16M) into
>>> system global IO port address. Here system global IO port address is
>>> ioport_resource[0, IO_SPACE_LIMIT).
>>> stage 2: map system IO port address into system memory address.
>>>
>>> We need two objects of struct resource_win to support above two-stage
>>> translation. One object, type of IORESOURCE_IO, is used to support
>>> stage one, and it will also used to allocate IO port resources
>>> for PCI devices. Another object, type of IORESOURCE_MMIO, is used
>>> to allocate resource from iomem_resource and setup MMIO mapping
>>> to actually access IO ports.
>>>
>>> For ARM64, it doesn't support multiple per-PCI-domain(bus local)
>>> IO port address space yet, so stage one seems to be optional
>>> becomes the offset between bus local IO port address and system
>>> IO port address is always 0. But we still need two objects of
>>> struct resource_win. The first object is
>>>      {
>>>          offset:0,
>>>          start:AddressMinimum,
>>>          end:AddressMaximum,
>>>          flags:IORESOURCE_IO
>>>      }
>>> Here it's type of IORESOURCE_IO and offset must be zero because
>>> pcibios_resource_to_bus() will access it translate system IO
>>> port address into bus local IO port address. With my patch,
>>> the struct resource_win object created by the ACPI core will
>>> be reused for this.
>>>
>>> The second object is:
>>>      {
>>>          offset:Translation_Offset,
>>>          start:AddressMinimum + Translation_Offset,
>>>          end:AddressMaximum + Translation_Offset,
>>>          flags:IORESOURCE_MMIO
>>>      }
>>> Arch code need to create the second struct resource_win object
>>> and actually setup the MMIO mapping.
>>>
>>> But there's really another bug need to get fixed, funciton
>>> acpi_dev_ioresource_flags() assumes bus local IO port address
>>> space is size of 64K, which is wrong for IA64 and ARM64.
>>>
>>
>> So what would be the Translation_Offset meaning for two cases DWordIo
>> (....,TypeTranslation) vs DWordIo (....,TypeStatic)? And why we did not
>> use TypeTranslation for IA64 so far?
>
> IA64 actually ignores the translation type flag and just assume it's
> TypeTranslation, so there may be some IA64 BIOS implementations
> accidentally using TypeStatic. That's why we parsing SparseTranslation
> flag without checking TranslationType flag. I feel ARM64 may face the
> same situation as IA64:(
>
> We may expect (TypeStatic, 0-offset) and (TypeTranslation,
> non-0-offset) in real word. For other two combinations, I haven't
> found a real usage yet, though theoretically they are possible.
>

I think we should not bend the generic code for IA64 only and expose 
other platforms to the same issue. Instead, lets interpret spec 
correctly and create IA64 quirk for the sake of backward compatibility. 
Thoughts?

Regards,
Tomasz
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jiang Liu Nov. 12, 2015, 3:05 p.m. UTC | #9
On 2015/11/12 22:45, Tomasz Nowicki wrote:
> On 12.11.2015 15:04, Jiang Liu wrote:
>> On 2015/11/12 21:21, Tomasz Nowicki wrote:
>>> On 12.11.2015 09:43, Jiang Liu wrote:
>>>> On 2015/11/12 1:46, Lorenzo Pieralisi wrote:
>>>>> On Tue, Nov 10, 2015 at 01:50:46PM +0800, Jiang Liu wrote:
>>>>>
>>>>> [...]
>>>>>
>>>>>>>> In particular, I would like to understand, for an eg DWordIO
>>>>>>>> descriptor,
>>>>>>>> what Range Minimum, Range Maximum and Translation Offset represent,
>>>>>>>> they can't mean different things depending on the SW parsing them,
>>>>>>>> this totally defeats the purpose.
>>>>>>>
>>>>>>> I have no clue about what those mean in ACPI though.
>>>>>>>
>>>>>>> Generally speaking, each PCI domain is expected to have a (normally
>>>>>>> 64KB)
>>>>>>> range of CPU addresses that gets translated into PCI I/O space the
>>>>>>> same
>>>>>>> way that config space and memory space are handled.
>>>>>>> This is true for almost every architecture except for x86, which
>>>>>>> uses
>>>>>>> different CPU instructions for I/O space compared to the other
>>>>>>> spaces.
>>>>>>>
>>>>>>>> By the way, ia64 ioremaps the translation_offset (ie
>>>>>>>> new_space()), so
>>>>>>>> basically that's the CPU physical address at which the PCI host
>>>>>>>> bridge
>>>>>>>> map the IO space transactions), I do not think ia64 is any
>>>>>>>> different from
>>>>>>>> arm64 in this respect, if it is please provide an HW description
>>>>>>>> here from
>>>>>>>> the PCI bus perspective here (also an example of ia64 ACPI PCI
>>>>>>>> host bridge
>>>>>>>> tables would help).
>>>>>>>
>>>>>>> The main difference between ia64 and a lot of the other
>>>>>>> architectures (e.g.
>>>>>>> sparc is different again) is that ia64 defines a logical address
>>>>>>> range
>>>>>>> in terms of having a small number for each I/O space followed by the
>>>>>>> offset within that space as a 'port number' and uses a mapping
>>>>>>> function
>>>>>>> that is defined as
>>>>>>>
>>>>>>> static inline void *__ia64_mk_io_addr (unsigned long port)
>>>>>>> {
>>>>>>>           struct io_space *space = &io_space[IO_SPACE_NR(port)];
>>>>>>>           return (space->mmio_base | IO_SPACE_PORT(port););
>>>>>>> }
>>>>>>> static inline unsigned int inl(unsigned long port)
>>>>>>> {
>>>>>>>           return *__ia64_mk_io_addr(port);
>>>>>>> }
>>>>>>>
>>>>>>> Most architectures allow only one I/O port range and put it at a
>>>>>>> fixed
>>>>>>> virtual address so that inl() simply becomes
>>>>>>>
>>>>>>> static inline u32 inl(unsigned long addr)
>>>>>>> {
>>>>>>>           return readl(PCI_IOBASE + addr);
>>>>>>> }
>>>>>>>
>>>>>>> which noticeably reduces code size.
>>>>>>>
>>>>>>> On some architectures (powerpc, arm, arm64), we then get the same
>>>>>>> simplified
>>>>>>> definition with a fixed virtual address, and use pci_ioremap_io() or
>>>>>>> something like that to to map a physical address range into this
>>>>>>> virtual
>>>>>>> address window at the correct io_offset;
>>>>>> Hi all,
>>>>>>      Thanks for explanation, I found a way to make the ACPI resource
>>>>>> parsing interface arch neutral, it should help to address Lorenzo's
>>>>>> concern. Please refer to the attached patch. (It's still RFC, not
>>>>>> tested
>>>>>> yet).
>>>>>
>>>>> If we go with this approach though, you are not adding the offset to
>>>>> the resource when parsing the memory spaces in acpi_decode_space(),
>>>>> are we
>>>>> sure that's what we really want ?
>>>>>
>>>>> In DT, a host bridge range has a:
>>>>>
>>>>> - CPU physical address
>>>>> - PCI bus address
>>>>>
>>>>> We use that to compute the offset between primary bus (ie CPU physical
>>>>> address) and secondary bus (ie PCI bus address).
>>>>>
>>>>> The value ending up in the PCI resource struct (for memory space) is
>>>>> the CPU physical address, if you do not add the offset in
>>>>> acpi_decode_space
>>>>> that does not hold true on platforms where CPU<->PCI offset != 0 on
>>>>> ACPI,
>>>>> am I wrong ?
>>>> Hi Lorenzo,
>>>>      I may have found the divergence between us about the design
>>>> here. You
>>>> treat it as a one-stage translation but I treat it as a
>>>> two-stage translation as below:
>>>> stage 1: map(translate) per-PCI-domain IO port address[0, 16M) into
>>>> system global IO port address. Here system global IO port address is
>>>> ioport_resource[0, IO_SPACE_LIMIT).
>>>> stage 2: map system IO port address into system memory address.
>>>>
>>>> We need two objects of struct resource_win to support above two-stage
>>>> translation. One object, type of IORESOURCE_IO, is used to support
>>>> stage one, and it will also used to allocate IO port resources
>>>> for PCI devices. Another object, type of IORESOURCE_MMIO, is used
>>>> to allocate resource from iomem_resource and setup MMIO mapping
>>>> to actually access IO ports.
>>>>
>>>> For ARM64, it doesn't support multiple per-PCI-domain(bus local)
>>>> IO port address space yet, so stage one seems to be optional
>>>> becomes the offset between bus local IO port address and system
>>>> IO port address is always 0. But we still need two objects of
>>>> struct resource_win. The first object is
>>>>      {
>>>>          offset:0,
>>>>          start:AddressMinimum,
>>>>          end:AddressMaximum,
>>>>          flags:IORESOURCE_IO
>>>>      }
>>>> Here it's type of IORESOURCE_IO and offset must be zero because
>>>> pcibios_resource_to_bus() will access it translate system IO
>>>> port address into bus local IO port address. With my patch,
>>>> the struct resource_win object created by the ACPI core will
>>>> be reused for this.
>>>>
>>>> The second object is:
>>>>      {
>>>>          offset:Translation_Offset,
>>>>          start:AddressMinimum + Translation_Offset,
>>>>          end:AddressMaximum + Translation_Offset,
>>>>          flags:IORESOURCE_MMIO
>>>>      }
>>>> Arch code need to create the second struct resource_win object
>>>> and actually setup the MMIO mapping.
>>>>
>>>> But there's really another bug need to get fixed, funciton
>>>> acpi_dev_ioresource_flags() assumes bus local IO port address
>>>> space is size of 64K, which is wrong for IA64 and ARM64.
>>>>
>>>
>>> So what would be the Translation_Offset meaning for two cases DWordIo
>>> (....,TypeTranslation) vs DWordIo (....,TypeStatic)? And why we did not
>>> use TypeTranslation for IA64 so far?
>>
>> IA64 actually ignores the translation type flag and just assume it's
>> TypeTranslation, so there may be some IA64 BIOS implementations
>> accidentally using TypeStatic. That's why we parsing SparseTranslation
>> flag without checking TranslationType flag. I feel ARM64 may face the
>> same situation as IA64:(
>>
>> We may expect (TypeStatic, 0-offset) and (TypeTranslation,
>> non-0-offset) in real word. For other two combinations, I haven't
>> found a real usage yet, though theoretically they are possible.
>>
> 
> I think we should not bend the generic code for IA64 only and expose
> other platforms to the same issue. Instead, lets interpret spec
> correctly and create IA64 quirk for the sake of backward compatibility.
> Thoughts?
I think there are at least two factors related to this issue.

First we still lack of a way/framework to fix errors in ACPI resource
descriptors. Recently we have refined ACPI resource parsing interfaces
and enforced strictly sanity check. This brings us some regressions
which are really BIOS flaws, but it used to work and now breaks:(
I'm still struggling to get those regressions fixed. So we may run
into the same situation if we enforce strict check for TranslationType:(

Second enforcing strict check doesn't bring us too much benifits.
Translation type is almost platform specific, and we haven't found a
platform support both TypeTranslation and TypeStatic, so arch code
may assume the correct translation type no matter what BIOS reports.
So it won't hurt us even BIOS reports wrong translation type.

So I'm tending to keep current implementation with looser checking,
otherwise it may cause regressions.
Thanks,
Gerry

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tomasz Nowicki Nov. 13, 2015, 12:57 p.m. UTC | #10
On 12.11.2015 16:05, Jiang Liu wrote:
> On 2015/11/12 22:45, Tomasz Nowicki wrote:
>> On 12.11.2015 15:04, Jiang Liu wrote:
>>> On 2015/11/12 21:21, Tomasz Nowicki wrote:
>>>> On 12.11.2015 09:43, Jiang Liu wrote:
>>>>> On 2015/11/12 1:46, Lorenzo Pieralisi wrote:
>>>>>> On Tue, Nov 10, 2015 at 01:50:46PM +0800, Jiang Liu wrote:
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>>>>> In particular, I would like to understand, for an eg DWordIO
>>>>>>>>> descriptor,
>>>>>>>>> what Range Minimum, Range Maximum and Translation Offset represent,
>>>>>>>>> they can't mean different things depending on the SW parsing them,
>>>>>>>>> this totally defeats the purpose.
>>>>>>>>
>>>>>>>> I have no clue about what those mean in ACPI though.
>>>>>>>>
>>>>>>>> Generally speaking, each PCI domain is expected to have a (normally
>>>>>>>> 64KB)
>>>>>>>> range of CPU addresses that gets translated into PCI I/O space the
>>>>>>>> same
>>>>>>>> way that config space and memory space are handled.
>>>>>>>> This is true for almost every architecture except for x86, which
>>>>>>>> uses
>>>>>>>> different CPU instructions for I/O space compared to the other
>>>>>>>> spaces.
>>>>>>>>
>>>>>>>>> By the way, ia64 ioremaps the translation_offset (ie
>>>>>>>>> new_space()), so
>>>>>>>>> basically that's the CPU physical address at which the PCI host
>>>>>>>>> bridge
>>>>>>>>> map the IO space transactions), I do not think ia64 is any
>>>>>>>>> different from
>>>>>>>>> arm64 in this respect, if it is please provide an HW description
>>>>>>>>> here from
>>>>>>>>> the PCI bus perspective here (also an example of ia64 ACPI PCI
>>>>>>>>> host bridge
>>>>>>>>> tables would help).
>>>>>>>>
>>>>>>>> The main difference between ia64 and a lot of the other
>>>>>>>> architectures (e.g.
>>>>>>>> sparc is different again) is that ia64 defines a logical address
>>>>>>>> range
>>>>>>>> in terms of having a small number for each I/O space followed by the
>>>>>>>> offset within that space as a 'port number' and uses a mapping
>>>>>>>> function
>>>>>>>> that is defined as
>>>>>>>>
>>>>>>>> static inline void *__ia64_mk_io_addr (unsigned long port)
>>>>>>>> {
>>>>>>>>            struct io_space *space = &io_space[IO_SPACE_NR(port)];
>>>>>>>>            return (space->mmio_base | IO_SPACE_PORT(port););
>>>>>>>> }
>>>>>>>> static inline unsigned int inl(unsigned long port)
>>>>>>>> {
>>>>>>>>            return *__ia64_mk_io_addr(port);
>>>>>>>> }
>>>>>>>>
>>>>>>>> Most architectures allow only one I/O port range and put it at a
>>>>>>>> fixed
>>>>>>>> virtual address so that inl() simply becomes
>>>>>>>>
>>>>>>>> static inline u32 inl(unsigned long addr)
>>>>>>>> {
>>>>>>>>            return readl(PCI_IOBASE + addr);
>>>>>>>> }
>>>>>>>>
>>>>>>>> which noticeably reduces code size.
>>>>>>>>
>>>>>>>> On some architectures (powerpc, arm, arm64), we then get the same
>>>>>>>> simplified
>>>>>>>> definition with a fixed virtual address, and use pci_ioremap_io() or
>>>>>>>> something like that to to map a physical address range into this
>>>>>>>> virtual
>>>>>>>> address window at the correct io_offset;
>>>>>>> Hi all,
>>>>>>>       Thanks for explanation, I found a way to make the ACPI resource
>>>>>>> parsing interface arch neutral, it should help to address Lorenzo's
>>>>>>> concern. Please refer to the attached patch. (It's still RFC, not
>>>>>>> tested
>>>>>>> yet).
>>>>>>
>>>>>> If we go with this approach though, you are not adding the offset to
>>>>>> the resource when parsing the memory spaces in acpi_decode_space(),
>>>>>> are we
>>>>>> sure that's what we really want ?
>>>>>>
>>>>>> In DT, a host bridge range has a:
>>>>>>
>>>>>> - CPU physical address
>>>>>> - PCI bus address
>>>>>>
>>>>>> We use that to compute the offset between primary bus (ie CPU physical
>>>>>> address) and secondary bus (ie PCI bus address).
>>>>>>
>>>>>> The value ending up in the PCI resource struct (for memory space) is
>>>>>> the CPU physical address, if you do not add the offset in
>>>>>> acpi_decode_space
>>>>>> that does not hold true on platforms where CPU<->PCI offset != 0 on
>>>>>> ACPI,
>>>>>> am I wrong ?
>>>>> Hi Lorenzo,
>>>>>       I may have found the divergence between us about the design
>>>>> here. You
>>>>> treat it as a one-stage translation but I treat it as a
>>>>> two-stage translation as below:
>>>>> stage 1: map(translate) per-PCI-domain IO port address[0, 16M) into
>>>>> system global IO port address. Here system global IO port address is
>>>>> ioport_resource[0, IO_SPACE_LIMIT).
>>>>> stage 2: map system IO port address into system memory address.
>>>>>
>>>>> We need two objects of struct resource_win to support above two-stage
>>>>> translation. One object, type of IORESOURCE_IO, is used to support
>>>>> stage one, and it will also used to allocate IO port resources
>>>>> for PCI devices. Another object, type of IORESOURCE_MMIO, is used
>>>>> to allocate resource from iomem_resource and setup MMIO mapping
>>>>> to actually access IO ports.
>>>>>
>>>>> For ARM64, it doesn't support multiple per-PCI-domain(bus local)
>>>>> IO port address space yet, so stage one seems to be optional
>>>>> becomes the offset between bus local IO port address and system
>>>>> IO port address is always 0. But we still need two objects of
>>>>> struct resource_win. The first object is
>>>>>       {
>>>>>           offset:0,
>>>>>           start:AddressMinimum,
>>>>>           end:AddressMaximum,
>>>>>           flags:IORESOURCE_IO
>>>>>       }
>>>>> Here it's type of IORESOURCE_IO and offset must be zero because
>>>>> pcibios_resource_to_bus() will access it translate system IO
>>>>> port address into bus local IO port address. With my patch,
>>>>> the struct resource_win object created by the ACPI core will
>>>>> be reused for this.
>>>>>
>>>>> The second object is:
>>>>>       {
>>>>>           offset:Translation_Offset,
>>>>>           start:AddressMinimum + Translation_Offset,
>>>>>           end:AddressMaximum + Translation_Offset,
>>>>>           flags:IORESOURCE_MMIO
>>>>>       }
>>>>> Arch code need to create the second struct resource_win object
>>>>> and actually setup the MMIO mapping.
>>>>>
>>>>> But there's really another bug need to get fixed, funciton
>>>>> acpi_dev_ioresource_flags() assumes bus local IO port address
>>>>> space is size of 64K, which is wrong for IA64 and ARM64.
>>>>>
>>>>
>>>> So what would be the Translation_Offset meaning for two cases DWordIo
>>>> (....,TypeTranslation) vs DWordIo (....,TypeStatic)? And why we did not
>>>> use TypeTranslation for IA64 so far?
>>>
>>> IA64 actually ignores the translation type flag and just assume it's
>>> TypeTranslation, so there may be some IA64 BIOS implementations
>>> accidentally using TypeStatic. That's why we parsing SparseTranslation
>>> flag without checking TranslationType flag. I feel ARM64 may face the
>>> same situation as IA64:(
>>>
>>> We may expect (TypeStatic, 0-offset) and (TypeTranslation,
>>> non-0-offset) in real word. For other two combinations, I haven't
>>> found a real usage yet, though theoretically they are possible.
>>>
>>
>> I think we should not bend the generic code for IA64 only and expose
>> other platforms to the same issue. Instead, lets interpret spec
>> correctly and create IA64 quirk for the sake of backward compatibility.
>> Thoughts?
> I think there are at least two factors related to this issue.
>
> First we still lack of a way/framework to fix errors in ACPI resource
> descriptors. Recently we have refined ACPI resource parsing interfaces
> and enforced strictly sanity check. This brings us some regressions
> which are really BIOS flaws, but it used to work and now breaks:(
> I'm still struggling to get those regressions fixed. So we may run
> into the same situation if we enforce strict check for TranslationType:(
>
> Second enforcing strict check doesn't bring us too much benifits.
> Translation type is almost platform specific, and we haven't found a
> platform support both TypeTranslation and TypeStatic, so arch code
> may assume the correct translation type no matter what BIOS reports.
> So it won't hurt us even BIOS reports wrong translation type.
>

That is my point, lets pass down all we need from resource range 
descriptors to arch code, then archs with known quirks can whatever is 
needed to make it works. However, generic code like acpi_decode_space 
cannot play with offsets with silent IA64 assumption.

To sum it up, your last patch looks ok to me modulo Lorenzo's concern:
 >>>>>> If we go with this approach though, you are not adding the offset to
 >>>>>> the resource when parsing the memory spaces in acpi_decode_space(),
 >>>>>> are we
 >>>>>> sure that's what we really want ?
 >>>>>>
 >>>>>> In DT, a host bridge range has a:
 >>>>>>
 >>>>>> - CPU physical address
 >>>>>> - PCI bus address
 >>>>>>
 >>>>>> We use that to compute the offset between primary bus (ie CPU 
physical
 >>>>>> address) and secondary bus (ie PCI bus address).
 >>>>>>
 >>>>>> The value ending up in the PCI resource struct (for memory space) is
 >>>>>> the CPU physical address, if you do not add the offset in
 >>>>>> acpi_decode_space
 >>>>>> that does not hold true on platforms where CPU<->PCI offset != 0 on
 >>>>>> ACPI,
 >>>>>> am I wrong ?
His concern is that your patch will cause:
acpi_pci_root_validate_resources(&device->dev, list,
				 IORESOURCE_MEM);
to fail now.

Regards,
Tomasz
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lorenzo Pieralisi Nov. 13, 2015, 5:03 p.m. UTC | #11
Please trim your emails, thanks.

On Fri, Nov 13, 2015 at 01:57:30PM +0100, Tomasz Nowicki wrote:
> On 12.11.2015 16:05, Jiang Liu wrote:

[...]

> >>>IA64 actually ignores the translation type flag and just assume it's
> >>>TypeTranslation, so there may be some IA64 BIOS implementations
> >>>accidentally using TypeStatic. That's why we parsing SparseTranslation
> >>>flag without checking TranslationType flag. I feel ARM64 may face the
> >>>same situation as IA64:(
> >>>
> >>>We may expect (TypeStatic, 0-offset) and (TypeTranslation,
> >>>non-0-offset) in real word. For other two combinations, I haven't
> >>>found a real usage yet, though theoretically they are possible.

I do not understand why (TypeStatic, non-0-offset) is not a valid
option. Aren't there any (x86) platforms with a CPU<->PCI _physical_
address space offset out there (I am talking about memory space) ?

> >>I think we should not bend the generic code for IA64 only and expose
> >>other platforms to the same issue. Instead, lets interpret spec
> >>correctly and create IA64 quirk for the sake of backward compatibility.
> >>Thoughts?
> >I think there are at least two factors related to this issue.
> >
> >First we still lack of a way/framework to fix errors in ACPI resource
> >descriptors. Recently we have refined ACPI resource parsing interfaces
> >and enforced strictly sanity check. This brings us some regressions
> >which are really BIOS flaws, but it used to work and now breaks:(
> >I'm still struggling to get those regressions fixed. So we may run
> >into the same situation if we enforce strict check for TranslationType:(
> >
> >Second enforcing strict check doesn't bring us too much benifits.
> >Translation type is almost platform specific, and we haven't found a
> >platform support both TypeTranslation and TypeStatic, so arch code
> >may assume the correct translation type no matter what BIOS reports.
> >So it won't hurt us even BIOS reports wrong translation type.

TBH I still do not understand what TranslationType actually means,
I will ask whoever added that to the specification to understand it.

> That is my point, lets pass down all we need from resource range
> descriptors to arch code, then archs with known quirks can whatever
> is needed to make it works. However, generic code like
> acpi_decode_space cannot play with offsets with silent IA64
> assumption.
> 
> To sum it up, your last patch looks ok to me modulo Lorenzo's concern:
> >>>>>> If we go with this approach though, you are not adding the offset to
> >>>>>> the resource when parsing the memory spaces in acpi_decode_space(),
> >>>>>> are we
> >>>>>> sure that's what we really want ?
> >>>>>>
> >>>>>> In DT, a host bridge range has a:
> >>>>>>
> >>>>>> - CPU physical address
> >>>>>> - PCI bus address
> >>>>>>
> >>>>>> We use that to compute the offset between primary bus (ie CPU
> physical
> >>>>>> address) and secondary bus (ie PCI bus address).
> >>>>>>
> >>>>>> The value ending up in the PCI resource struct (for memory space) is
> >>>>>> the CPU physical address, if you do not add the offset in
> >>>>>> acpi_decode_space
> >>>>>> that does not hold true on platforms where CPU<->PCI offset != 0 on
> >>>>>> ACPI,
> >>>>>> am I wrong ?
> His concern is that your patch will cause:
> acpi_pci_root_validate_resources(&device->dev, list,
> 				 IORESOURCE_MEM);
> to fail now.

Not really. My concern is that there might be platforms out there with
an offset between the CPU and PCI physical address spaces, and if we
remove the offset value in acpi_decode_space we can break them,
because in the kernel struct resource data we have to have CPU physical
addresses, not PCI ones. If offset == 0, we are home and dry, I do not
understand why that's a given, which is what we would assume if Jiang's
patch is merged as-is unless I am mistaken.

Thanks,
Lorenzo
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jiang Liu Nov. 13, 2015, 5:49 p.m. UTC | #12
On 2015/11/14 1:03, Lorenzo Pieralisi wrote:
> Please trim your emails, thanks.
> 
> On Fri, Nov 13, 2015 at 01:57:30PM +0100, Tomasz Nowicki wrote:
>> On 12.11.2015 16:05, Jiang Liu wrote:
> 
> [...]
> 
>>>>> IA64 actually ignores the translation type flag and just assume it's
>>>>> TypeTranslation, so there may be some IA64 BIOS implementations
>>>>> accidentally using TypeStatic. That's why we parsing SparseTranslation
>>>>> flag without checking TranslationType flag. I feel ARM64 may face the
>>>>> same situation as IA64:(
>>>>>
>>>>> We may expect (TypeStatic, 0-offset) and (TypeTranslation,
>>>>> non-0-offset) in real word. For other two combinations, I haven't
>>>>> found a real usage yet, though theoretically they are possible.
> 
> I do not understand why (TypeStatic, non-0-offset) is not a valid
> option. Aren't there any (x86) platforms with a CPU<->PCI _physical_
> address space offset out there (I am talking about memory space) ?

It's possible, but we have found such a design yet. If we eventually
encounter such a case, we need to enhance x86 specific code to support
it.

> 
>>>> I think we should not bend the generic code for IA64 only and expose
>>>> other platforms to the same issue. Instead, lets interpret spec
>>>> correctly and create IA64 quirk for the sake of backward compatibility.
>>>> Thoughts?
>>> I think there are at least two factors related to this issue.
>>>
>>> First we still lack of a way/framework to fix errors in ACPI resource
>>> descriptors. Recently we have refined ACPI resource parsing interfaces
>>> and enforced strictly sanity check. This brings us some regressions
>>> which are really BIOS flaws, but it used to work and now breaks:(
>>> I'm still struggling to get those regressions fixed. So we may run
>>> into the same situation if we enforce strict check for TranslationType:(
>>>
>>> Second enforcing strict check doesn't bring us too much benifits.
>>> Translation type is almost platform specific, and we haven't found a
>>> platform support both TypeTranslation and TypeStatic, so arch code
>>> may assume the correct translation type no matter what BIOS reports.
>>> So it won't hurt us even BIOS reports wrong translation type.
> 
> TBH I still do not understand what TranslationType actually means,
> I will ask whoever added that to the specification to understand it.
> 
>> That is my point, lets pass down all we need from resource range
>> descriptors to arch code, then archs with known quirks can whatever
>> is needed to make it works. However, generic code like
>> acpi_decode_space cannot play with offsets with silent IA64
>> assumption.
>>
>> To sum it up, your last patch looks ok to me modulo Lorenzo's concern:
>>>>>>>> If we go with this approach though, you are not adding the offset to
>>>>>>>> the resource when parsing the memory spaces in acpi_decode_space(),
>>>>>>>> are we
>>>>>>>> sure that's what we really want ?
>>>>>>>>
>>>>>>>> In DT, a host bridge range has a:
>>>>>>>>
>>>>>>>> - CPU physical address
>>>>>>>> - PCI bus address
>>>>>>>>
>>>>>>>> We use that to compute the offset between primary bus (ie CPU
>> physical
>>>>>>>> address) and secondary bus (ie PCI bus address).
>>>>>>>>
>>>>>>>> The value ending up in the PCI resource struct (for memory space) is
>>>>>>>> the CPU physical address, if you do not add the offset in
>>>>>>>> acpi_decode_space
>>>>>>>> that does not hold true on platforms where CPU<->PCI offset != 0 on
>>>>>>>> ACPI,
>>>>>>>> am I wrong ?
>> His concern is that your patch will cause:
>> acpi_pci_root_validate_resources(&device->dev, list,
>> 				 IORESOURCE_MEM);
>> to fail now.
> 
> Not really. My concern is that there might be platforms out there with
> an offset between the CPU and PCI physical address spaces, and if we
> remove the offset value in acpi_decode_space we can break them,
> because in the kernel struct resource data we have to have CPU physical
> addresses, not PCI ones. If offset == 0, we are home and dry, I do not
> understand why that's a given, which is what we would assume if Jiang's
> patch is merged as-is unless I am mistaken.
We try to exclude offset from struct resource in generic ACPI code,
and it's the arch's responsibility to decide how to manipulate struct
resource object if offset is not zero.

Currently offset is always zero for x86, and IA64 has arch specific
code to handle non-zero offset. So we should be safe without breaking
existing code. For ARM64, it's a little different from IA64 so it's
hard to share code between IA64 and ARM64.

> 
> Thanks,
> Lorenzo
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lorenzo Pieralisi Nov. 20, 2015, 10:18 a.m. UTC | #13
Hi Jiang,

On Sat, Nov 14, 2015 at 01:49:08AM +0800, Jiang Liu wrote:

[...]

> > Not really. My concern is that there might be platforms out there with
> > an offset between the CPU and PCI physical address spaces, and if we
> > remove the offset value in acpi_decode_space we can break them,
> > because in the kernel struct resource data we have to have CPU physical
> > addresses, not PCI ones. If offset == 0, we are home and dry, I do not
> > understand why that's a given, which is what we would assume if Jiang's
> > patch is merged as-is unless I am mistaken.
> We try to exclude offset from struct resource in generic ACPI code,
> and it's the arch's responsibility to decide how to manipulate struct
> resource object if offset is not zero.
> 
> Currently offset is always zero for x86, and IA64 has arch specific
> code to handle non-zero offset. So we should be safe without breaking
> existing code. For ARM64, it's a little different from IA64 so it's
> hard to share code between IA64 and ARM64.

Can you drop the patch on the mailing lists please, we actually need it
to get ACPI ARM64 PCIe support in the kernel, please let me know how
you want to handle this and if you need any help.

Thanks,
Lorenzo
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tomasz Nowicki Nov. 27, 2015, 6:59 a.m. UTC | #14
On 20.11.2015 11:18, Lorenzo Pieralisi wrote:
> Hi Jiang,
>
> On Sat, Nov 14, 2015 at 01:49:08AM +0800, Jiang Liu wrote:
>
> [...]
>
>>> Not really. My concern is that there might be platforms out there with
>>> an offset between the CPU and PCI physical address spaces, and if we
>>> remove the offset value in acpi_decode_space we can break them,
>>> because in the kernel struct resource data we have to have CPU physical
>>> addresses, not PCI ones. If offset == 0, we are home and dry, I do not
>>> understand why that's a given, which is what we would assume if Jiang's
>>> patch is merged as-is unless I am mistaken.
>> We try to exclude offset from struct resource in generic ACPI code,
>> and it's the arch's responsibility to decide how to manipulate struct
>> resource object if offset is not zero.
>>
>> Currently offset is always zero for x86, and IA64 has arch specific
>> code to handle non-zero offset. So we should be safe without breaking
>> existing code. For ARM64, it's a little different from IA64 so it's
>> hard to share code between IA64 and ARM64.
>
> Can you drop the patch on the mailing lists please, we actually need it
> to get ACPI ARM64 PCIe support in the kernel, please let me know how
> you want to handle this and if you need any help.
>

Kindly reminder.

Regards,
Tomasz
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c
index 8f6ac2f8ae4c..ee84300797d8 100644
--- a/arch/ia64/pci/pci.c
+++ b/arch/ia64/pci/pci.c
@@ -154,7 +154,7 @@  static int add_io_space(struct device *dev, struct pci_root_info *info,
 	struct resource_entry *iospace;
 	struct resource *resource, *res = entry->res;
 	char *name;
-	unsigned long base, min, max, base_port;
+	unsigned long base_mmio, base_port;
 	unsigned int sparse = 0, space_nr, len;
 
 	len = strlen(info->common.name) + 32;
@@ -172,12 +172,10 @@  static int add_io_space(struct device *dev, struct pci_root_info *info,
 		goto free_resource;
 
 	name = (char *)(iospace + 1);
-	min = res->start - entry->offset;
-	max = res->end - entry->offset;
-	base = __pa(io_space[space_nr].mmio_base);
+	base_mmio = __pa(io_space[space_nr].mmio_base);
 	base_port = IO_SPACE_BASE(space_nr);
 	snprintf(name, len, "%s I/O Ports %08lx-%08lx", info->common.name,
-		 base_port + min, base_port + max);
+		 base_port + res->start, base_port + res->end);
 
 	/*
 	 * The SDM guarantees the legacy 0-64K space is sparse, but if the
@@ -190,19 +188,27 @@  static int add_io_space(struct device *dev, struct pci_root_info *info,
 	resource = iospace->res;
 	resource->name  = name;
 	resource->flags = IORESOURCE_MEM;
-	resource->start = base + (sparse ? IO_SPACE_SPARSE_ENCODING(min) : min);
-	resource->end   = base + (sparse ? IO_SPACE_SPARSE_ENCODING(max) : max);
+	resource->start = base_mmio;
+	resource->end = base_mmio;
+	if (sparse) {
+		resource->start += IO_SPACE_SPARSE_ENCODING(res->start);
+		resource->end += IO_SPACE_SPARSE_ENCODING(res->end);
+	} else {
+		resource->start += res->start;
+		resource->end += res->end;
+	}
 	if (insert_resource(&iomem_resource, resource)) {
 		dev_err(dev,
 			"can't allocate host bridge io space resource  %pR\n",
 			resource);
 		goto free_resource;
 	}
+	resource_list_add_tail(iospace, &info->io_resources);
 
+	/* Adjust base of original IO port resource descriptor */
 	entry->offset = base_port;
-	res->start = min + base_port;
-	res->end = max + base_port;
-	resource_list_add_tail(iospace, &info->io_resources);
+	res->start += base_port;
+	res->end += base_port;
 
 	return 0;
 
diff --git a/drivers/acpi/resource.c b/drivers/acpi/resource.c
index cdc5c2599beb..6578f68b17be 100644
--- a/drivers/acpi/resource.c
+++ b/drivers/acpi/resource.c
@@ -190,8 +190,7 @@  static bool acpi_decode_space(struct resource_win *win,
 {
 	u8 iodec = attr->granularity == 0xfff ? ACPI_DECODE_10 : ACPI_DECODE_16;
 	bool wp = addr->info.mem.write_protect;
-	u64 len = attr->address_length;
-	u64 start, end, offset = 0;
+	u64 len = attr->address_length, offset = 0;
 	struct resource *res = &win->res;
 
 	/*
@@ -215,14 +214,13 @@  static bool acpi_decode_space(struct resource_win *win,
 	else if (attr->translation_offset)
 		pr_debug("ACPI: translation_offset(%lld) is invalid for non-bridge device.\n",
 			 attr->translation_offset);
-	start = attr->minimum + offset;
-	end = attr->maximum + offset;
 
 	win->offset = offset;
-	res->start = start;
-	res->end = end;
+	res->start = attr->minimum;
+	res->end = attr->maximum;
 	if (sizeof(resource_size_t) < sizeof(u64) &&
-	    (offset != win->offset || start != res->start || end != res->end)) {
+	    (offset != win->offset || attr->minimum != res->start ||
+	     attr->maximum != res->end)) {
 		pr_warn("acpi resource window ([%#llx-%#llx] ignored, not CPU addressable)\n",
 			attr->minimum, attr->maximum);
 		return false;