Message ID | 38dc0bd6c1350568f4978dba3373249753944a61.1471434672.git.agordeev@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi, Alex, I see that we are adding ARM tests in the series as well, so I am thinking whether this work is a prepare work to test ARM SMMU? Also, some comments below... On Wed, Aug 17, 2016 at 02:07:07PM +0200, Alexander Gordeev wrote: [...] > -unsigned long pci_bar_addr(pcidevaddr_t dev, int bar_num) > +phys_addr_t pci_bar_addr(pcidevaddr_t dev, int bar_num) > { > uint32_t bar = pci_bar_get(dev, bar_num); > + uint32_t mask = pci_bar_mask(bar); > + uint64_t addr = bar & mask; > > - if (bar & PCI_BASE_ADDRESS_SPACE_IO) > - return bar & PCI_BASE_ADDRESS_IO_MASK; > - else > - return bar & PCI_BASE_ADDRESS_MEM_MASK; > + if (pci_bar_is64(dev, bar_num)) > + addr |= (uint64_t)pci_bar_get(dev, bar_num + 1) << 32; > + > + return pci_translate_addr(dev, addr); Raw question: do we need to translate bar addresses as well? [...] > +static inline > +phys_addr_t pci_translate_addr(pcidevaddr_t dev __unused, uint64_t addr) > +{ > + return addr; > +} > + We are not using dev here (and in following patches), I don't know ARM, but at least x86 will need this to translate IOVA into PA (in other words, each device can have its own IO address space). If this codes are for common, shall we consider that as well? Thanks. -- peterx -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Sep 23, 2016 at 03:14:04PM +0800, Peter Xu wrote: > Hi, Alex, > > I see that we are adding ARM tests in the series as well, so I am > thinking whether this work is a prepare work to test ARM SMMU? That would be excellent. Adding Eric, he may be able to help here. drew > > Also, some comments below... > > On Wed, Aug 17, 2016 at 02:07:07PM +0200, Alexander Gordeev wrote: > > [...] > > > -unsigned long pci_bar_addr(pcidevaddr_t dev, int bar_num) > > +phys_addr_t pci_bar_addr(pcidevaddr_t dev, int bar_num) > > { > > uint32_t bar = pci_bar_get(dev, bar_num); > > + uint32_t mask = pci_bar_mask(bar); > > + uint64_t addr = bar & mask; > > > > - if (bar & PCI_BASE_ADDRESS_SPACE_IO) > > - return bar & PCI_BASE_ADDRESS_IO_MASK; > > - else > > - return bar & PCI_BASE_ADDRESS_MEM_MASK; > > + if (pci_bar_is64(dev, bar_num)) > > + addr |= (uint64_t)pci_bar_get(dev, bar_num + 1) << 32; > > + > > + return pci_translate_addr(dev, addr); > > Raw question: do we need to translate bar addresses as well? > > [...] > > > +static inline > > +phys_addr_t pci_translate_addr(pcidevaddr_t dev __unused, uint64_t addr) > > +{ > > + return addr; > > +} > > + > > We are not using dev here (and in following patches), I don't know > ARM, but at least x86 will need this to translate IOVA into PA (in > other words, each device can have its own IO address space). If this > codes are for common, shall we consider that as well? > > Thanks. > > -- peterx > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Sep 23, 2016 at 10:51:06AM +0200, Andrew Jones wrote: > On Fri, Sep 23, 2016 at 03:14:04PM +0800, Peter Xu wrote: > > Hi, Alex, > > > > I see that we are adding ARM tests in the series as well, so I am > > thinking whether this work is a prepare work to test ARM SMMU? > > That would be excellent. Adding Eric, he may be able to help here. I should say "Alex" rather than "we". :) And I posted this question since I see that Alex is adding addr translation logic to PCI. That's what I am trying to do for unit tests for Intel IOMMU devices. So I asked. Though (as mentioned in the comment) I don't know whether PCI bar needs translation. I thought only DMA requests are required to be translated in device address space. Please correct if I am wrong. Thanks, -- peterx -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Sep 23, 2016 at 03:14:04PM +0800, Peter Xu wrote: > > -unsigned long pci_bar_addr(pcidevaddr_t dev, int bar_num) > > +phys_addr_t pci_bar_addr(pcidevaddr_t dev, int bar_num) > > { > > uint32_t bar = pci_bar_get(dev, bar_num); > > + uint32_t mask = pci_bar_mask(bar); > > + uint64_t addr = bar & mask; > > > > - if (bar & PCI_BASE_ADDRESS_SPACE_IO) > > - return bar & PCI_BASE_ADDRESS_IO_MASK; > > - else > > - return bar & PCI_BASE_ADDRESS_MEM_MASK; > > + if (pci_bar_is64(dev, bar_num)) > > + addr |= (uint64_t)pci_bar_get(dev, bar_num + 1) << 32; > > + > > + return pci_translate_addr(dev, addr); > > Raw question: do we need to translate bar addresses as well? I believe, yes. Unless we always have identity mapping between PCI address space and CPU physical address space I can not realize how could it be done otherwise. But even if we were, I would leave the translation routine for clarity. > [...] > > > +static inline > > +phys_addr_t pci_translate_addr(pcidevaddr_t dev __unused, uint64_t addr) > > +{ > > + return addr; > > +} > > + > > We are not using dev here (and in following patches), I don't know > ARM, but at least x86 will need this to translate IOVA into PA (in > other words, each device can have its own IO address space). If this > codes are for common, shall we consider that as well? Could this function be extended or the prototype blocks the proper implementation? (I guess, I will get the answer once I look into your works). > Thanks. > > -- peterx -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Oct 12, 2016 at 04:37:54PM +0200, Alexander Gordeev wrote: > On Fri, Sep 23, 2016 at 03:14:04PM +0800, Peter Xu wrote: > > > -unsigned long pci_bar_addr(pcidevaddr_t dev, int bar_num) > > > +phys_addr_t pci_bar_addr(pcidevaddr_t dev, int bar_num) > > > { > > > uint32_t bar = pci_bar_get(dev, bar_num); > > > + uint32_t mask = pci_bar_mask(bar); > > > + uint64_t addr = bar & mask; > > > > > > - if (bar & PCI_BASE_ADDRESS_SPACE_IO) > > > - return bar & PCI_BASE_ADDRESS_IO_MASK; > > > - else > > > - return bar & PCI_BASE_ADDRESS_MEM_MASK; > > > + if (pci_bar_is64(dev, bar_num)) > > > + addr |= (uint64_t)pci_bar_get(dev, bar_num + 1) << 32; > > > + > > > + return pci_translate_addr(dev, addr); > > > > Raw question: do we need to translate bar addresses as well? > > I believe, yes. > > Unless we always have identity mapping between PCI address space and > CPU physical address space I can not realize how could it be done > otherwise. But even if we were, I would leave the translation routine > for clarity. Sorry I didn't quite catch your point. Are we talking about IOMMU address remapping here? IMHO BAR addresses are from CPU's point of view. It's only used by CPU, not device. In that case, BAR address should not be translated at least by IOMMU (no matter for x86/arm or whatever). Take Linux as example: pci_ioremap_bar() is responsible to be called for any PCI drivers to map device memory bars into kernel virtual address space. Basically it does: void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int bar) { struct resource *res = &pdev->resource[bar]; return ioremap_nocache(res->start, resource_size(res)); } So as it is written: I believe we don't translate the bar address (which should be res->start). We use it as physical address. Or, do you mean other kinds of translation that I don't aware of? Another silly question: I see pci_translate_addr() defined in both lib/x86/asm/pci.h and lib/asm-generic/pci-host-bridge.h. How's that working? > > > [...] > > > > > +static inline > > > +phys_addr_t pci_translate_addr(pcidevaddr_t dev __unused, uint64_t addr) > > > +{ > > > + return addr; > > > +} > > > + > > > > We are not using dev here (and in following patches), I don't know > > ARM, but at least x86 will need this to translate IOVA into PA (in > > other words, each device can have its own IO address space). If this > > codes are for common, shall we consider that as well? > > Could this function be extended or the prototype blocks the proper > implementation? (I guess, I will get the answer once I look into your > works). Yes, it can be extended. Thanks, -- peterx -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Oct 13, 2016 at 02:40:35PM +0800, Peter Xu wrote: > > > > + return pci_translate_addr(dev, addr); > > > > > > Raw question: do we need to translate bar addresses as well? > > > > I believe, yes. > > > > Unless we always have identity mapping between PCI address space and > > CPU physical address space I can not realize how could it be done > > otherwise. But even if we were, I would leave the translation routine > > for clarity. > > Sorry I didn't quite catch your point. Are we talking about IOMMU > address remapping here? IMHO BAR addresses are from CPU's point of > view. It's only used by CPU, not device. In that case, BAR address > should not be translated at least by IOMMU (no matter for x86/arm or > whatever). > > Take Linux as example: pci_ioremap_bar() is responsible to be called > for any PCI drivers to map device memory bars into kernel virtual > address space. Basically it does: > > void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int bar) > { > struct resource *res = &pdev->resource[bar]; > return ioremap_nocache(res->start, resource_size(res)); > } > > So as it is written: I believe we don't translate the bar address > (which should be res->start). We use it as physical address. > > Or, do you mean other kinds of translation that I don't aware of? Yes, I mean translation from PCI bus address space to CPU physical address space. These two busses are different and hence need a translation. I assume Linux pci_dev::resource[] have translated address, but it is not what PCI devices see. Unless I do not terribly missing somethig, BAR addresses is what a device sees on its AD[0..31] pins. > Another silly question: I see pci_translate_addr() defined in both > lib/x86/asm/pci.h and lib/asm-generic/pci-host-bridge.h. How's that > working? So an architecutre should implement pci_translate_addr() in order to provide mapping between PCI bus address and CPU physical address. These two are the implementations. Thanks! > Thanks, > > -- peterx -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Oct 13, 2016 at 04:16:03PM +0200, Alexander Gordeev wrote: > On Thu, Oct 13, 2016 at 02:40:35PM +0800, Peter Xu wrote: > > > > > + return pci_translate_addr(dev, addr); > > > > > > > > Raw question: do we need to translate bar addresses as well? > > > > > > I believe, yes. > > > > > > Unless we always have identity mapping between PCI address space and > > > CPU physical address space I can not realize how could it be done > > > otherwise. But even if we were, I would leave the translation routine > > > for clarity. > > > > Sorry I didn't quite catch your point. Are we talking about IOMMU > > address remapping here? IMHO BAR addresses are from CPU's point of > > view. It's only used by CPU, not device. In that case, BAR address > > should not be translated at least by IOMMU (no matter for x86/arm or > > whatever). > > > > Take Linux as example: pci_ioremap_bar() is responsible to be called > > for any PCI drivers to map device memory bars into kernel virtual > > address space. Basically it does: > > > > void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int bar) > > { > > struct resource *res = &pdev->resource[bar]; > > return ioremap_nocache(res->start, resource_size(res)); > > } > > > > So as it is written: I believe we don't translate the bar address > > (which should be res->start). We use it as physical address. > > > > Or, do you mean other kinds of translation that I don't aware of? > > Yes, I mean translation from PCI bus address space to CPU physical > address space. These two busses are different and hence need a > translation. I assume Linux pci_dev::resource[] have translated > address, but it is not what PCI devices see. Unless I do not terribly > missing somethig, BAR addresses is what a device sees on its AD[0..31] > pins. I believe pci_dev::resource[] should be assigned by BIOS or something before Linux. At that time, IOMMU is possibly even not inited. So no chance for a translation at all. If you see PCI Local Bus Spec Rev 3.0, chap 6.2.5.1: Power-up software needs to build a consistent address map before booting the machine to an operating system. This means it has to determine how much memory is in the system, and how much address space the I/O controllers in the system require. After determining this information, power-up software can map the I/O controllers into reasonable locations and proceed with system boot. In order to do this mapping in a device independent manner, the base registers for this mapping are placed in the predefined header portion of Configuration Space. BARs for each PCI devices should be pre-allocated during power-up software, and a consistent map is built with the knowledge of existing RAM in the system. If you boot a VM with/without IOMMU, you'll see that BAR addresses won't change before/after enabling IOMMU. Thanks, -- peterx -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Peter, On Fri, Oct 14, 2016 at 02:23:55PM +0800, Peter Xu wrote: > On Thu, Oct 13, 2016 at 04:16:03PM +0200, Alexander Gordeev wrote: > > On Thu, Oct 13, 2016 at 02:40:35PM +0800, Peter Xu wrote: > > > > > > + return pci_translate_addr(dev, addr); > > > > > > > > > > Raw question: do we need to translate bar addresses as well? > > > > > > > > I believe, yes. > > > > > > > > Unless we always have identity mapping between PCI address space and > > > > CPU physical address space I can not realize how could it be done > > > > otherwise. But even if we were, I would leave the translation routine > > > > for clarity. > > > > > > Sorry I didn't quite catch your point. Are we talking about IOMMU > > > address remapping here? IMHO BAR addresses are from CPU's point of > > > view. It's only used by CPU, not device. In that case, BAR address > > > should not be translated at least by IOMMU (no matter for x86/arm or > > > whatever). > > > > > > Take Linux as example: pci_ioremap_bar() is responsible to be called > > > for any PCI drivers to map device memory bars into kernel virtual > > > address space. Basically it does: > > > > > > void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int bar) > > > { > > > struct resource *res = &pdev->resource[bar]; > > > return ioremap_nocache(res->start, resource_size(res)); > > > } > > > > > > So as it is written: I believe we don't translate the bar address > > > (which should be res->start). We use it as physical address. > > > > > > Or, do you mean other kinds of translation that I don't aware of? > > > > Yes, I mean translation from PCI bus address space to CPU physical > > address space. These two busses are different and hence need a > > translation. I assume Linux pci_dev::resource[] have translated > > address, but it is not what PCI devices see. Unless I do not terribly > > missing somethig, BAR addresses is what a device sees on its AD[0..31] > > pins. > > I believe pci_dev::resource[] should be assigned by BIOS or something > before Linux. At that time, IOMMU is possibly even not inited. So no > chance for a translation at all. kvm-unit-tests != Linux kvm-unit-tests/arm doesn't have a bootloader at all (not counting QEMU initializing registers, and a handful of QEMU generated instructions that gives us a kick) Anyway, I'm glad you're reviewing this series (my PCI skills are minimal), but you'll have to review it in the right context. In this case, more of a seabios context. Thanks, drew -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Oct 14, 2016 at 08:55:20AM +0200, Andrew Jones wrote: > Hi Peter, > > On Fri, Oct 14, 2016 at 02:23:55PM +0800, Peter Xu wrote: > > On Thu, Oct 13, 2016 at 04:16:03PM +0200, Alexander Gordeev wrote: > > > On Thu, Oct 13, 2016 at 02:40:35PM +0800, Peter Xu wrote: > > > > > > > + return pci_translate_addr(dev, addr); > > > > > > > > > > > > Raw question: do we need to translate bar addresses as well? > > > > > > > > > > I believe, yes. > > > > > > > > > > Unless we always have identity mapping between PCI address space and > > > > > CPU physical address space I can not realize how could it be done > > > > > otherwise. But even if we were, I would leave the translation routine > > > > > for clarity. > > > > > > > > Sorry I didn't quite catch your point. Are we talking about IOMMU > > > > address remapping here? IMHO BAR addresses are from CPU's point of > > > > view. It's only used by CPU, not device. In that case, BAR address > > > > should not be translated at least by IOMMU (no matter for x86/arm or > > > > whatever). > > > > > > > > Take Linux as example: pci_ioremap_bar() is responsible to be called > > > > for any PCI drivers to map device memory bars into kernel virtual > > > > address space. Basically it does: > > > > > > > > void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int bar) > > > > { > > > > struct resource *res = &pdev->resource[bar]; > > > > return ioremap_nocache(res->start, resource_size(res)); > > > > } > > > > > > > > So as it is written: I believe we don't translate the bar address > > > > (which should be res->start). We use it as physical address. > > > > > > > > Or, do you mean other kinds of translation that I don't aware of? > > > > > > Yes, I mean translation from PCI bus address space to CPU physical > > > address space. These two busses are different and hence need a > > > translation. I assume Linux pci_dev::resource[] have translated > > > address, but it is not what PCI devices see. Unless I do not terribly > > > missing somethig, BAR addresses is what a device sees on its AD[0..31] > > > pins. > > > > I believe pci_dev::resource[] should be assigned by BIOS or something > > before Linux. At that time, IOMMU is possibly even not inited. So no > > chance for a translation at all. > > kvm-unit-tests != Linux > > kvm-unit-tests/arm doesn't have a bootloader at all (not counting QEMU > initializing registers, and a handful of QEMU generated instructions > that gives us a kick) > > Anyway, I'm glad you're reviewing this series (my PCI skills are > minimal), but you'll have to review it in the right context. In > this case, more of a seabios context. Thank you for pointing out. :-) I know little about PCI as well, just want to know the fact on how we should treat BAR addresses. So I'd say my comments are more like questions rather than I disagree with the changes. E.g., even if we don't have BIOS, do we really need to translate BAR addresses on ARM? Sorry if I brought too much noise to this thread. Looking forward to Alex's further works. Thanks! -- peterx -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Oct 14, 2016 at 02:23:55PM +0800, Peter Xu wrote: > On Thu, Oct 13, 2016 at 04:16:03PM +0200, Alexander Gordeev wrote: > > On Thu, Oct 13, 2016 at 02:40:35PM +0800, Peter Xu wrote: > > > > > > + return pci_translate_addr(dev, addr); > > > > > > > > > > Raw question: do we need to translate bar addresses as well? > > > > > > > > I believe, yes. > > > > > > > > Unless we always have identity mapping between PCI address space and > > > > CPU physical address space I can not realize how could it be done > > > > otherwise. But even if we were, I would leave the translation routine > > > > for clarity. > > > > > > Sorry I didn't quite catch your point. Are we talking about IOMMU > > > address remapping here? IMHO BAR addresses are from CPU's point of > > > view. It's only used by CPU, not device. In that case, BAR address > > > should not be translated at least by IOMMU (no matter for x86/arm or > > > whatever). > > > > > > Take Linux as example: pci_ioremap_bar() is responsible to be called > > > for any PCI drivers to map device memory bars into kernel virtual > > > address space. Basically it does: > > > > > > void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int bar) > > > { > > > struct resource *res = &pdev->resource[bar]; > > > return ioremap_nocache(res->start, resource_size(res)); > > > } > > > > > > So as it is written: I believe we don't translate the bar address > > > (which should be res->start). We use it as physical address. > > > > > > Or, do you mean other kinds of translation that I don't aware of? > > > > Yes, I mean translation from PCI bus address space to CPU physical > > address space. These two busses are different and hence need a > > translation. I assume Linux pci_dev::resource[] have translated > > address, but it is not what PCI devices see. Unless I do not terribly > > missing somethig, BAR addresses is what a device sees on its AD[0..31] > > pins. > > I believe pci_dev::resource[] should be assigned by BIOS or something > before Linux. At that time, IOMMU is possibly even not inited. So no > chance for a translation at all. So pci-host-generic.c is "BIOS or something" in this context. There is no contradiction with the your excerpt. Please note "Figure 1-2: PCI System Block Diagram" in the spec. The CPU and PCI busses are two physically different busses interconnected by a bridge. A CPU data access to PCI devices initiates data access on a PCI bus, but adresses seen by the CPU and PCI devices do not necessarily match. So in case of x86 they do (lib/x86/asm/pci.h): static inline phys_addr_t pci_translate_addr(pcidevaddr_t dev __unused, uint64_t addr) { return addr; } But in case of ARM they might not (lib/asm-generic/pci-host-bridge.h): static inline phys_addr_t pci_translate_addr(pcidevaddr_t dev __unused, uint64_t addr) { /* * Assume we only have single PCI host bridge in a system. */ return pci_host_bridge_get_paddr(addr); } > If you see PCI Local Bus Spec Rev 3.0, chap 6.2.5.1: > > Power-up software needs to build a consistent address map before > booting the machine to an operating system. This means it has to > determine how much memory is in the system, and how much address > space the I/O controllers in the system require. After determining > this information, power-up software can map the I/O controllers > into reasonable locations and proceed with system boot. In order > to do this mapping in a device independent manner, the base > registers for this mapping are placed in the predefined header > portion of Configuration Space. > > BARs for each PCI devices should be pre-allocated during power-up > software, and a consistent map is built with the knowledge of existing > RAM in the system. Yep, see how pci_probe() / pci_alloc_resource() allocate and assign BARs to each device. It is the case for PPC/ARM but in case of x86 it is BIOS who does it. > If you boot a VM with/without IOMMU, you'll see that BAR addresses > won't change before/after enabling IOMMU. BAR addresses are in PCI bus address space. It sounds quite logical. (I am not experienced with IOMMU address translation, though). > Thanks, > > -- peterx -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Oct 14, 2016 at 02:37:25PM +0200, Alexander Gordeev wrote: [...] > So pci-host-generic.c is "BIOS or something" in this context. There is > no contradiction with the your excerpt. Please note "Figure 1-2: PCI > System Block Diagram" in the spec. The CPU and PCI busses are two > physically different busses interconnected by a bridge. > > A CPU data access to PCI devices initiates data access on a PCI bus, > but adresses seen by the CPU and PCI devices do not necessarily match. > > So in case of x86 they do (lib/x86/asm/pci.h): > > static inline > phys_addr_t pci_translate_addr(pcidevaddr_t dev __unused, uint64_t addr) > { > return addr; > } > > But in case of ARM they might not (lib/asm-generic/pci-host-bridge.h): > > static inline > phys_addr_t pci_translate_addr(pcidevaddr_t dev __unused, uint64_t addr) > { > /* > * Assume we only have single PCI host bridge in a system. > */ > return pci_host_bridge_get_paddr(addr); > } > > > If you see PCI Local Bus Spec Rev 3.0, chap 6.2.5.1: > > > > Power-up software needs to build a consistent address map before > > booting the machine to an operating system. This means it has to > > determine how much memory is in the system, and how much address > > space the I/O controllers in the system require. After determining > > this information, power-up software can map the I/O controllers > > into reasonable locations and proceed with system boot. In order > > to do this mapping in a device independent manner, the base > > registers for this mapping are placed in the predefined header > > portion of Configuration Space. > > > > BARs for each PCI devices should be pre-allocated during power-up > > software, and a consistent map is built with the knowledge of existing > > RAM in the system. > > Yep, see how pci_probe() / pci_alloc_resource() allocate and assign > BARs to each device. It is the case for PPC/ARM but in case of x86 > it is BIOS who does it. > > > If you boot a VM with/without IOMMU, you'll see that BAR addresses > > won't change before/after enabling IOMMU. > > BAR addresses are in PCI bus address space. It sounds quite logical. > (I am not experienced with IOMMU address translation, though). Thank you for the thorough explanation. I am still not sure I fully understand the whole thing, but looks like the translation here is not the same as x86 IOMMU address translations, and x86 is special in this case (BAR addresses should be the same as physical addresses in x86, while not for ARM/PPC). I am sorry for the noise. Looking forward to your next version. Thanks! -- peterx -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/lib/pci.c b/lib/pci.c index ce481bbfadb6..e69e828523d4 100644 --- a/lib/pci.c +++ b/lib/pci.c @@ -21,19 +21,71 @@ pcidevaddr_t pci_find_dev(uint16_t vendor_id, uint16_t device_id) return PCIDEVADDR_INVALID; } +static uint32_t pci_bar_mask(uint32_t bar) +{ + return (bar & PCI_BASE_ADDRESS_SPACE_IO) ? + PCI_BASE_ADDRESS_IO_MASK : PCI_BASE_ADDRESS_MEM_MASK; +} + static uint32_t pci_bar_get(pcidevaddr_t dev, int bar_num) { return pci_config_readl(dev, PCI_BASE_ADDRESS_0 + bar_num * 4); } -unsigned long pci_bar_addr(pcidevaddr_t dev, int bar_num) +phys_addr_t pci_bar_addr(pcidevaddr_t dev, int bar_num) { uint32_t bar = pci_bar_get(dev, bar_num); + uint32_t mask = pci_bar_mask(bar); + uint64_t addr = bar & mask; - if (bar & PCI_BASE_ADDRESS_SPACE_IO) - return bar & PCI_BASE_ADDRESS_IO_MASK; - else - return bar & PCI_BASE_ADDRESS_MEM_MASK; + if (pci_bar_is64(dev, bar_num)) + addr |= (uint64_t)pci_bar_get(dev, bar_num + 1) << 32; + + return pci_translate_addr(dev, addr); +} + +/* + * To determine the amount of address space needed by a PCI device, + * one must save the original value of the BAR, write a value of + * all 1's to the register, and then read it back. The amount of + * memory can be then determined by masking the information bits, + * performing a bitwise NOT, and incrementing the value by 1. + * + * The following pci_bar_size_helper() and pci_bar_size() functions + * implement the algorithm. + */ +static uint32_t pci_bar_size_helper(pcidevaddr_t dev, int bar_num) +{ + int off = PCI_BASE_ADDRESS_0 + bar_num * 4; + uint32_t bar, size; + + bar = pci_config_readl(dev, off); + pci_config_writel(dev, off, ~0u); + size = pci_config_readl(dev, off); + pci_config_writel(dev, off, bar); + + return size; +} + +phys_addr_t pci_bar_size(pcidevaddr_t dev, int bar_num) +{ + uint32_t bar, size; + + size = pci_bar_size_helper(dev, bar_num); + if (!size) + return 0; + + bar = pci_bar_get(dev, bar_num); + size &= pci_bar_mask(bar); + + if (pci_bar_is64(dev, bar_num)) { + phys_addr_t size64 = pci_bar_size_helper(dev, bar_num + 1); + size64 = (size64 << 32) | size; + + return ~size64 + 1; + } else { + return ~size + 1; + } } bool pci_bar_is_memory(pcidevaddr_t dev, int bar_num) @@ -47,3 +99,14 @@ bool pci_bar_is_valid(pcidevaddr_t dev, int bar_num) { return pci_bar_get(dev, bar_num); } + +bool pci_bar_is64(pcidevaddr_t dev, int bar_num) +{ + uint32_t bar = pci_bar_get(dev, bar_num); + + if (bar & PCI_BASE_ADDRESS_SPACE_IO) + return false; + + return (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == + PCI_BASE_ADDRESS_MEM_TYPE_64; +} diff --git a/lib/pci.h b/lib/pci.h index 066fac77b237..8eec236cb123 100644 --- a/lib/pci.h +++ b/lib/pci.h @@ -16,7 +16,22 @@ enum { }; extern pcidevaddr_t pci_find_dev(uint16_t vendor_id, uint16_t device_id); -extern unsigned long pci_bar_addr(pcidevaddr_t dev, int bar_num); + +/* + * @bar_num in all BAR access functions below is the index of the 32-bit + * register starting from the PCI_BASE_ADDRESS_0 offset. + * + * In cases where the BAR size is 64-bit, a caller should still provide + * @bar_num in terms of 32-bit words. For example, if a device has a 64-bit + * BAR#0 and a 32-bit BAR#1, then caller should provide 2 to address BAR#1, + * not 1. + * + * It is expected the caller is aware of the device BAR layout and never + * tries to address the middle of a 64-bit register. + */ +extern phys_addr_t pci_bar_addr(pcidevaddr_t dev, int bar_num); +extern phys_addr_t pci_bar_size(pcidevaddr_t dev, int bar_num); +extern bool pci_bar_is64(pcidevaddr_t dev, int bar_num); extern bool pci_bar_is_memory(pcidevaddr_t dev, int bar_num); extern bool pci_bar_is_valid(pcidevaddr_t dev, int bar_num); diff --git a/lib/x86/asm/pci.h b/lib/x86/asm/pci.h index 821a2c1e180a..7384b91adba1 100644 --- a/lib/x86/asm/pci.h +++ b/lib/x86/asm/pci.h @@ -35,4 +35,10 @@ static inline void pci_config_writel(pcidevaddr_t dev, uint8_t reg, uint32_t val outl(val, 0xCFC); } +static inline +phys_addr_t pci_translate_addr(pcidevaddr_t dev __unused, uint64_t addr) +{ + return addr; +} + #endif