diff mbox series

[v4,3/7] PCI: endpoint: Introduce pci_epc_map_align()

Message ID 20241007040319.157412-4-dlemoal@kernel.org (mailing list archive)
State Superseded
Headers show
Series Improve PCI memory mapping API | expand

Commit Message

Damien Le Moal Oct. 7, 2024, 4:03 a.m. UTC
Some endpoint controllers have requirements on the alignment of the
controller physical memory address that must be used to map a RC PCI
address region. For instance, the rockchip endpoint controller uses
at most the lower 20 bits of a physical memory address region as the
lower bits of an RC PCI address. For mapping a PCI address region of
size bytes starting from pci_addr, the exact number of address bits
used is the number of address bits changing in the address range
[pci_addr..pci_addr + size - 1].

For this example, this creates the following constraints:
1) The offset into the controller physical memory allocated for a
   mapping depends on the mapping size *and* the starting PCI address
   for the mapping.
2) A mapping size cannot exceed the controller windows size (1MB) minus
   the offset needed into the allocated physical memory, which can end
   up being a smaller size than the desired mapping size.

Handling these constraints independently of the controller being used
in an endpoint function driver is not possible with the current EPC
API as only the ->align field in struct pci_epc_features is provided
and used for BAR (inbound ATU mappings) mapping. A new API is needed
for function drivers to discover mapping constraints and handle
non-static requirements based on the RC PCI address range to access.

Introduce the function pci_epc_map_align() and the endpoint controller
operation ->map_align to allow endpoint function drivers to obtain the
size and the offset into a controller address region that must be
allocated and mapped to access an RC PCI address region. The size
of the mapping provided by pci_epc_map_align() can then be used as the
size argument for the function pci_epc_mem_alloc_addr().
The offset into the allocated controller memory provided can be used to
correctly handle data transfers.

For endpoint controllers that have PCI address alignment constraints,
pci_epc_map_align() may indicate upon return an effective PCI address
region mapping size that is smaller (but not 0) than the requested PCI
address region size. For such case, an endpoint function driver must
handle data accesses over the desired PCI address range in fragments,
by repeatedly using pci_epc_map_align() over the PCI address range.

The controller operation ->map_align is optional: controllers that do
not have any alignment constraints for mapping a RC PCI address region
do not need to implement this operation. For such controllers,
pci_epc_map_align() always returns the mapping size as equal to the
requested size of the PCI region and an offset equal to 0.

The new structure struct pci_epc_map is introduced to represent a
mapping start PCI address, mapping effective size, the size and offset
into the controller memory needed for mapping the PCI address region as
well as the physical and virtual CPU addresses of the mapping (phys_base
and virt_base fields). For convenience, the physical and virtual CPU
addresses within that mapping to access the target RC PCI address region
are also provided (phys_addr and virt_addr fields).

Co-developed-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
---
 drivers/pci/endpoint/pci-epc-core.c | 66 +++++++++++++++++++++++++++++
 include/linux/pci-epc.h             | 37 ++++++++++++++++
 2 files changed, 103 insertions(+)

Comments

Manivannan Sadhasivam Oct. 10, 2024, 2:36 p.m. UTC | #1
On Mon, Oct 07, 2024 at 01:03:15PM +0900, Damien Le Moal wrote:
> Some endpoint controllers have requirements on the alignment of the
> controller physical memory address that must be used to map a RC PCI
> address region. For instance, the rockchip endpoint controller uses
> at most the lower 20 bits of a physical memory address region as the
> lower bits of an RC PCI address. For mapping a PCI address region of
> size bytes starting from pci_addr, the exact number of address bits
> used is the number of address bits changing in the address range
> [pci_addr..pci_addr + size - 1].
> 
> For this example, this creates the following constraints:
> 1) The offset into the controller physical memory allocated for a
>    mapping depends on the mapping size *and* the starting PCI address
>    for the mapping.
> 2) A mapping size cannot exceed the controller windows size (1MB) minus
>    the offset needed into the allocated physical memory, which can end
>    up being a smaller size than the desired mapping size.
> 
> Handling these constraints independently of the controller being used
> in an endpoint function driver is not possible with the current EPC
> API as only the ->align field in struct pci_epc_features is provided
> and used for BAR (inbound ATU mappings) mapping. A new API is needed
> for function drivers to discover mapping constraints and handle
> non-static requirements based on the RC PCI address range to access.
> 
> Introduce the function pci_epc_map_align() and the endpoint controller
> operation ->map_align to allow endpoint function drivers to obtain the
> size and the offset into a controller address region that must be
> allocated and mapped to access an RC PCI address region. The size
> of the mapping provided by pci_epc_map_align() can then be used as the
> size argument for the function pci_epc_mem_alloc_addr().
> The offset into the allocated controller memory provided can be used to
> correctly handle data transfers.
> 
> For endpoint controllers that have PCI address alignment constraints,
> pci_epc_map_align() may indicate upon return an effective PCI address
> region mapping size that is smaller (but not 0) than the requested PCI
> address region size. For such case, an endpoint function driver must
> handle data accesses over the desired PCI address range in fragments,
> by repeatedly using pci_epc_map_align() over the PCI address range.
> 
> The controller operation ->map_align is optional: controllers that do
> not have any alignment constraints for mapping a RC PCI address region
> do not need to implement this operation. For such controllers,
> pci_epc_map_align() always returns the mapping size as equal to the
> requested size of the PCI region and an offset equal to 0.
> 
> The new structure struct pci_epc_map is introduced to represent a
> mapping start PCI address, mapping effective size, the size and offset
> into the controller memory needed for mapping the PCI address region as
> well as the physical and virtual CPU addresses of the mapping (phys_base
> and virt_base fields). For convenience, the physical and virtual CPU
> addresses within that mapping to access the target RC PCI address region
> are also provided (phys_addr and virt_addr fields).
> 

I'm fine with the concept of this patch, but I don't get why you need an API for
this and not just a callback to be used in the pci_epc_mem_{map/unmap} APIs.
Furthermore, I don't see an user of this API (in 3 series you've sent out so
far). Let me know if I failed to spot it.

Also, the API name pci_epc_map_align() sounds like it does the mapping, but it
doesn't. So I'd not have it exposed as an API at all.

- Mani

> Co-developed-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
> Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> Reviewed-by: Niklas Cassel <cassel@kernel.org>
> ---
>  drivers/pci/endpoint/pci-epc-core.c | 66 +++++++++++++++++++++++++++++
>  include/linux/pci-epc.h             | 37 ++++++++++++++++
>  2 files changed, 103 insertions(+)
> 
> diff --git a/drivers/pci/endpoint/pci-epc-core.c b/drivers/pci/endpoint/pci-epc-core.c
> index b854f1bab26f..1adccf07c33e 100644
> --- a/drivers/pci/endpoint/pci-epc-core.c
> +++ b/drivers/pci/endpoint/pci-epc-core.c
> @@ -435,6 +435,72 @@ void pci_epc_unmap_addr(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
>  }
>  EXPORT_SYMBOL_GPL(pci_epc_unmap_addr);
>  
> +/**
> + * pci_epc_map_align() - Get the offset into and the size of a controller memory
> + *			 address region needed to map a RC PCI address region
> + * @epc: the EPC device on which address is allocated
> + * @func_no: the physical endpoint function number in the EPC device
> + * @vfunc_no: the virtual endpoint function number in the physical function
> + * @pci_addr: The RC PCI address to map
> + * @pci_size: the number of bytes to map starting from @pci_addr
> + * @map: populate here the actual size and offset into the controller memory
> + *       that must be allocated for the mapping
> + *
> + * If defined, invoke the controller map_align operation to obtain the size and
> + * the offset into a controller address region that must be allocated to map
> + * @pci_size bytes of the RC PCI address space starting from @pci_addr.
> + *
> + * On return, @map->pci_size indicates the effective size of the mapping that
> + * can be handled by the controller. This size may be smaller than the requested
> + * @pci_size. In such case, the endpoint function driver must handle the mapping
> + * using several fragments. The offset into the controller memory for the
> + * effective mapping of the RC PCI address range
> + * @pci_addr..@pci_addr+@map->pci_size is indicated by @map->map_ofst.
> + *
> + * If the target controller does not define a map_align operation, it is assumed
> + * that the controller has no PCI address mapping alignment constraint.
> + */
> +int pci_epc_map_align(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> +		      u64 pci_addr, size_t pci_size, struct pci_epc_map *map)
> +{
> +	int ret;
> +
> +	if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
> +		return -EINVAL;
> +
> +	if (!pci_size || !map)
> +		return -EINVAL;
> +
> +	/*
> +	 * Initialize and remember the PCI address region to be mapped. The
> +	 * controller ->map_align() operation may change the map->pci_size to a
> +	 * smaller value.
> +	 */
> +	memset(map, 0, sizeof(*map));
> +	map->pci_addr = pci_addr;
> +	map->pci_size = pci_size;
> +
> +	if (!epc->ops->map_align) {
> +		/*
> +		 * Assume that the EP controller has no alignment constraint,
> +		 * that is, that the PCI address to map and the size of the
> +		 * controller memory needed for the mapping are the same as
> +		 * specified by the caller.
> +		 */
> +		map->map_pci_addr = pci_addr;
> +		map->map_size = pci_size;
> +		map->map_ofst = 0;
> +		return 0;
> +	}
> +
> +	mutex_lock(&epc->lock);
> +	ret = epc->ops->map_align(epc, func_no, vfunc_no, map);
> +	mutex_unlock(&epc->lock);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(pci_epc_map_align);
> +
>  /**
>   * pci_epc_map_addr() - map CPU address to PCI address
>   * @epc: the EPC device on which address is allocated
> diff --git a/include/linux/pci-epc.h b/include/linux/pci-epc.h
> index 42ef06136bd1..9df8a83e8d10 100644
> --- a/include/linux/pci-epc.h
> +++ b/include/linux/pci-epc.h
> @@ -32,11 +32,44 @@ pci_epc_interface_string(enum pci_epc_interface_type type)
>  	}
>  }
>  
> +/**
> + * struct pci_epc_map - information about EPC memory for mapping a RC PCI
> + *                      address range
> + * @pci_addr: start address of the RC PCI address range to map
> + * @pci_size: size of the RC PCI address range mapped from @pci_addr
> + * @map_pci_addr: RC PCI address used as the first address mapped (may be lower
> + *                than @pci_addr)
> + * @map_size: size of the controller memory needed for mapping the RC PCI address
> + *            range @pci_addr..@pci_addr+@pci_size
> + * @map_ofst: offset into the mapped controller memory to access @pci_addr
> + * @phys_base: base physical address of the allocated EPC memory for mapping the
> + *             RC PCI address range
> + * @phys_addr: physical address at which @pci_addr is mapped
> + * @virt_base: base virtual address of the allocated EPC memory for mapping the
> + *             RC PCI address range
> + * @virt_addr: virtual address at which @pci_addr is mapped
> + */
> +struct pci_epc_map {
> +	phys_addr_t	pci_addr;
> +	size_t		pci_size;
> +
> +	phys_addr_t	map_pci_addr;
> +	size_t		map_size;
> +	phys_addr_t	map_ofst;
> +
> +	phys_addr_t	phys_base;
> +	phys_addr_t	phys_addr;
> +	void __iomem	*virt_base;
> +	void __iomem	*virt_addr;
> +};
> +
>  /**
>   * struct pci_epc_ops - set of function pointers for performing EPC operations
>   * @write_header: ops to populate configuration space header
>   * @set_bar: ops to configure the BAR
>   * @clear_bar: ops to reset the BAR
> + * @map_align: operation to get the size and offset into a controller memory
> + *             window needed to map an RC PCI address region
>   * @map_addr: ops to map CPU address to PCI address
>   * @unmap_addr: ops to unmap CPU address and PCI address
>   * @set_msi: ops to set the requested number of MSI interrupts in the MSI
> @@ -61,6 +94,8 @@ struct pci_epc_ops {
>  			   struct pci_epf_bar *epf_bar);
>  	void	(*clear_bar)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
>  			     struct pci_epf_bar *epf_bar);
> +	int	(*map_align)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> +			    struct pci_epc_map *map);
>  	int	(*map_addr)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
>  			    phys_addr_t addr, u64 pci_addr, size_t size);
>  	void	(*unmap_addr)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> @@ -241,6 +276,8 @@ int pci_epc_set_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
>  		    struct pci_epf_bar *epf_bar);
>  void pci_epc_clear_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
>  		       struct pci_epf_bar *epf_bar);
> +int pci_epc_map_align(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> +		      u64 pci_addr, size_t size, struct pci_epc_map *map);
>  int pci_epc_map_addr(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
>  		     phys_addr_t phys_addr,
>  		     u64 pci_addr, size_t size);
> -- 
> 2.46.2
>
Damien Le Moal Oct. 11, 2024, 1:07 a.m. UTC | #2
On 10/10/24 23:36, Manivannan Sadhasivam wrote:
> On Mon, Oct 07, 2024 at 01:03:15PM +0900, Damien Le Moal wrote:
>> Some endpoint controllers have requirements on the alignment of the
>> controller physical memory address that must be used to map a RC PCI
>> address region. For instance, the rockchip endpoint controller uses
>> at most the lower 20 bits of a physical memory address region as the
>> lower bits of an RC PCI address. For mapping a PCI address region of
>> size bytes starting from pci_addr, the exact number of address bits
>> used is the number of address bits changing in the address range
>> [pci_addr..pci_addr + size - 1].
>>
>> For this example, this creates the following constraints:
>> 1) The offset into the controller physical memory allocated for a
>>    mapping depends on the mapping size *and* the starting PCI address
>>    for the mapping.
>> 2) A mapping size cannot exceed the controller windows size (1MB) minus
>>    the offset needed into the allocated physical memory, which can end
>>    up being a smaller size than the desired mapping size.
>>
>> Handling these constraints independently of the controller being used
>> in an endpoint function driver is not possible with the current EPC
>> API as only the ->align field in struct pci_epc_features is provided
>> and used for BAR (inbound ATU mappings) mapping. A new API is needed
>> for function drivers to discover mapping constraints and handle
>> non-static requirements based on the RC PCI address range to access.
>>
>> Introduce the function pci_epc_map_align() and the endpoint controller
>> operation ->map_align to allow endpoint function drivers to obtain the
>> size and the offset into a controller address region that must be
>> allocated and mapped to access an RC PCI address region. The size
>> of the mapping provided by pci_epc_map_align() can then be used as the
>> size argument for the function pci_epc_mem_alloc_addr().
>> The offset into the allocated controller memory provided can be used to
>> correctly handle data transfers.
>>
>> For endpoint controllers that have PCI address alignment constraints,
>> pci_epc_map_align() may indicate upon return an effective PCI address
>> region mapping size that is smaller (but not 0) than the requested PCI
>> address region size. For such case, an endpoint function driver must
>> handle data accesses over the desired PCI address range in fragments,
>> by repeatedly using pci_epc_map_align() over the PCI address range.
>>
>> The controller operation ->map_align is optional: controllers that do
>> not have any alignment constraints for mapping a RC PCI address region
>> do not need to implement this operation. For such controllers,
>> pci_epc_map_align() always returns the mapping size as equal to the
>> requested size of the PCI region and an offset equal to 0.
>>
>> The new structure struct pci_epc_map is introduced to represent a
>> mapping start PCI address, mapping effective size, the size and offset
>> into the controller memory needed for mapping the PCI address region as
>> well as the physical and virtual CPU addresses of the mapping (phys_base
>> and virt_base fields). For convenience, the physical and virtual CPU
>> addresses within that mapping to access the target RC PCI address region
>> are also provided (phys_addr and virt_addr fields).
>>
> 
> I'm fine with the concept of this patch, but I don't get why you need an API for
> this and not just a callback to be used in the pci_epc_mem_{map/unmap} APIs.
> Furthermore, I don't see an user of this API (in 3 series you've sent out so
> far). Let me know if I failed to spot it.
> 
> Also, the API name pci_epc_map_align() sounds like it does the mapping, but it
> doesn't. So I'd not have it exposed as an API at all.

OK. Fine with me. I will move this inside pci_epc_mem_map(). But note that
without this function, pci_epc_mem_alloc_addr() and pci_epc_map_addr() are
totally useless for EP controllers that have a mapping alignment requirement,
which without the pci_epc_map_align() function, an endpoint function driver
cannot discover *at all* currently. That does not fix the overall API of EPC...

By not having pci_epc_map_align(), pci_epc_mem_alloc_addr() and
pci_epc_map_addr() remain broken, but the introduction of pci_epc_mem_map() does
provide a working solution for the general case.

So I think we will still need to do something about this bad state of the API later.
Manivannan Sadhasivam Oct. 12, 2024, 6:32 a.m. UTC | #3
On Fri, Oct 11, 2024 at 10:07:30AM +0900, Damien Le Moal wrote:
> On 10/10/24 23:36, Manivannan Sadhasivam wrote:
> > On Mon, Oct 07, 2024 at 01:03:15PM +0900, Damien Le Moal wrote:
> >> Some endpoint controllers have requirements on the alignment of the
> >> controller physical memory address that must be used to map a RC PCI
> >> address region. For instance, the rockchip endpoint controller uses
> >> at most the lower 20 bits of a physical memory address region as the
> >> lower bits of an RC PCI address. For mapping a PCI address region of
> >> size bytes starting from pci_addr, the exact number of address bits
> >> used is the number of address bits changing in the address range
> >> [pci_addr..pci_addr + size - 1].
> >>
> >> For this example, this creates the following constraints:
> >> 1) The offset into the controller physical memory allocated for a
> >>    mapping depends on the mapping size *and* the starting PCI address
> >>    for the mapping.
> >> 2) A mapping size cannot exceed the controller windows size (1MB) minus
> >>    the offset needed into the allocated physical memory, which can end
> >>    up being a smaller size than the desired mapping size.
> >>
> >> Handling these constraints independently of the controller being used
> >> in an endpoint function driver is not possible with the current EPC
> >> API as only the ->align field in struct pci_epc_features is provided
> >> and used for BAR (inbound ATU mappings) mapping. A new API is needed
> >> for function drivers to discover mapping constraints and handle
> >> non-static requirements based on the RC PCI address range to access.
> >>
> >> Introduce the function pci_epc_map_align() and the endpoint controller
> >> operation ->map_align to allow endpoint function drivers to obtain the
> >> size and the offset into a controller address region that must be
> >> allocated and mapped to access an RC PCI address region. The size
> >> of the mapping provided by pci_epc_map_align() can then be used as the
> >> size argument for the function pci_epc_mem_alloc_addr().
> >> The offset into the allocated controller memory provided can be used to
> >> correctly handle data transfers.
> >>
> >> For endpoint controllers that have PCI address alignment constraints,
> >> pci_epc_map_align() may indicate upon return an effective PCI address
> >> region mapping size that is smaller (but not 0) than the requested PCI
> >> address region size. For such case, an endpoint function driver must
> >> handle data accesses over the desired PCI address range in fragments,
> >> by repeatedly using pci_epc_map_align() over the PCI address range.
> >>
> >> The controller operation ->map_align is optional: controllers that do
> >> not have any alignment constraints for mapping a RC PCI address region
> >> do not need to implement this operation. For such controllers,
> >> pci_epc_map_align() always returns the mapping size as equal to the
> >> requested size of the PCI region and an offset equal to 0.
> >>
> >> The new structure struct pci_epc_map is introduced to represent a
> >> mapping start PCI address, mapping effective size, the size and offset
> >> into the controller memory needed for mapping the PCI address region as
> >> well as the physical and virtual CPU addresses of the mapping (phys_base
> >> and virt_base fields). For convenience, the physical and virtual CPU
> >> addresses within that mapping to access the target RC PCI address region
> >> are also provided (phys_addr and virt_addr fields).
> >>
> > 
> > I'm fine with the concept of this patch, but I don't get why you need an API for
> > this and not just a callback to be used in the pci_epc_mem_{map/unmap} APIs.
> > Furthermore, I don't see an user of this API (in 3 series you've sent out so
> > far). Let me know if I failed to spot it.
> > 
> > Also, the API name pci_epc_map_align() sounds like it does the mapping, but it
> > doesn't. So I'd not have it exposed as an API at all.
> 
> OK. Fine with me. I will move this inside pci_epc_mem_map(). But note that
> without this function, pci_epc_mem_alloc_addr() and pci_epc_map_addr() are
> totally useless for EP controllers that have a mapping alignment requirement,
> which without the pci_epc_map_align() function, an endpoint function driver
> cannot discover *at all* currently. That does not fix the overall API of EPC...
> 

Not at all. EPF drivers still can use "epf_mhi->epc_features->align" to discover
the alignment requirement and calculate the offset on their own (please see
pci-epf-mhi). But I'm not in favor of that approach since the APIs need to do
that job and that's why I like your pci_epc_mem_map() API.

> By not having pci_epc_map_align(), pci_epc_mem_alloc_addr() and
> pci_epc_map_addr() remain broken, but the introduction of pci_epc_mem_map() does
> provide a working solution for the general case.
> 
> So I think we will still need to do something about this bad state of the API later.
> 

We can always rework the APIs to incorporate the alignment requirement.

- Mani
Damien Le Moal Oct. 12, 2024, 8:30 a.m. UTC | #4
On 10/12/24 15:32, Manivannan Sadhasivam wrote:
> On Fri, Oct 11, 2024 at 10:07:30AM +0900, Damien Le Moal wrote:
>> On 10/10/24 23:36, Manivannan Sadhasivam wrote:
>>> On Mon, Oct 07, 2024 at 01:03:15PM +0900, Damien Le Moal wrote:
>>>> Some endpoint controllers have requirements on the alignment of the
>>>> controller physical memory address that must be used to map a RC PCI
>>>> address region. For instance, the rockchip endpoint controller uses
>>>> at most the lower 20 bits of a physical memory address region as the
>>>> lower bits of an RC PCI address. For mapping a PCI address region of
>>>> size bytes starting from pci_addr, the exact number of address bits
>>>> used is the number of address bits changing in the address range
>>>> [pci_addr..pci_addr + size - 1].
>>>>
>>>> For this example, this creates the following constraints:
>>>> 1) The offset into the controller physical memory allocated for a
>>>>    mapping depends on the mapping size *and* the starting PCI address
>>>>    for the mapping.
>>>> 2) A mapping size cannot exceed the controller windows size (1MB) minus
>>>>    the offset needed into the allocated physical memory, which can end
>>>>    up being a smaller size than the desired mapping size.
>>>>
>>>> Handling these constraints independently of the controller being used
>>>> in an endpoint function driver is not possible with the current EPC
>>>> API as only the ->align field in struct pci_epc_features is provided
>>>> and used for BAR (inbound ATU mappings) mapping. A new API is needed
>>>> for function drivers to discover mapping constraints and handle
>>>> non-static requirements based on the RC PCI address range to access.
>>>>
>>>> Introduce the function pci_epc_map_align() and the endpoint controller
>>>> operation ->map_align to allow endpoint function drivers to obtain the
>>>> size and the offset into a controller address region that must be
>>>> allocated and mapped to access an RC PCI address region. The size
>>>> of the mapping provided by pci_epc_map_align() can then be used as the
>>>> size argument for the function pci_epc_mem_alloc_addr().
>>>> The offset into the allocated controller memory provided can be used to
>>>> correctly handle data transfers.
>>>>
>>>> For endpoint controllers that have PCI address alignment constraints,
>>>> pci_epc_map_align() may indicate upon return an effective PCI address
>>>> region mapping size that is smaller (but not 0) than the requested PCI
>>>> address region size. For such case, an endpoint function driver must
>>>> handle data accesses over the desired PCI address range in fragments,
>>>> by repeatedly using pci_epc_map_align() over the PCI address range.
>>>>
>>>> The controller operation ->map_align is optional: controllers that do
>>>> not have any alignment constraints for mapping a RC PCI address region
>>>> do not need to implement this operation. For such controllers,
>>>> pci_epc_map_align() always returns the mapping size as equal to the
>>>> requested size of the PCI region and an offset equal to 0.
>>>>
>>>> The new structure struct pci_epc_map is introduced to represent a
>>>> mapping start PCI address, mapping effective size, the size and offset
>>>> into the controller memory needed for mapping the PCI address region as
>>>> well as the physical and virtual CPU addresses of the mapping (phys_base
>>>> and virt_base fields). For convenience, the physical and virtual CPU
>>>> addresses within that mapping to access the target RC PCI address region
>>>> are also provided (phys_addr and virt_addr fields).
>>>>
>>>
>>> I'm fine with the concept of this patch, but I don't get why you need an API for
>>> this and not just a callback to be used in the pci_epc_mem_{map/unmap} APIs.
>>> Furthermore, I don't see an user of this API (in 3 series you've sent out so
>>> far). Let me know if I failed to spot it.
>>>
>>> Also, the API name pci_epc_map_align() sounds like it does the mapping, but it
>>> doesn't. So I'd not have it exposed as an API at all.
>>
>> OK. Fine with me. I will move this inside pci_epc_mem_map(). But note that
>> without this function, pci_epc_mem_alloc_addr() and pci_epc_map_addr() are
>> totally useless for EP controllers that have a mapping alignment requirement,
>> which without the pci_epc_map_align() function, an endpoint function driver
>> cannot discover *at all* currently. That does not fix the overall API of EPC...
>>
> 
> Not at all. EPF drivers still can use "epf_mhi->epc_features->align" to discover
> the alignment requirement and calculate the offset on their own (please see
> pci-epf-mhi). But I'm not in favor of that approach since the APIs need to do
> that job and that's why I like your pci_epc_mem_map() API.

That is *not* correct, at least in general. For two reasons:
1) epc_features->align defines alignment for BARs, that is, inbound windows
memory. It is not supposed to be about the outbound windows for mapping PCI
address space for doing mmio or DMA. Some controllers may have the same
alignment constraint for both ib and ob, in which case things will work, but
that is "just being lucky". I spent weeks with the RK3399 understanding that I
was not lucky with that one :)
2) A static alignment constraint does not work for all controllers. C.f. my
series fixing the RK3399 were I think I clearly explain that alignment of a
mapping depends on the PCI address AND the size being mapped, as both determine
the number of bits of address changing within the PCI address range to access.
Using a fixed boundary alignment for the RK3399 simply does not work at all. An
epf cannot know that simply looking at a fixed value...

What you said may be true for the mhi epf, because it requires special hardware
that has a simple fixed alignment constraint. ntb and vntb are also coded
assuming such constraint. So If I try to run ntb or vntg on the RK3399 it will
likely not work (actually it may, but out of sheer luck given that the addresses
that will be mapped will likely be aligned to 1MB, that is, the memory window size).

Developping the nvme epf driver where I was seeing completely random PCI
addresses for command buffers, I could make things work only after developping
the pci_epc_mem_map() with the controller operation telling the mapping
(.get_mem_map()) for every address to map.

> 
>> By not having pci_epc_map_align(), pci_epc_mem_alloc_addr() and
>> pci_epc_map_addr() remain broken, but the introduction of pci_epc_mem_map() does
>> provide a working solution for the general case.
>>
>> So I think we will still need to do something about this bad state of the API later.
>>
> 
> We can always rework the APIs to incorporate the alignment requirement.

See above. An API that advertise a simple alignment requirement will not work
for all controllers... But anyway, given that we are not getting any problem
report, people using the EP framework likely have setups that combine
controllers and endpoint drivers playing well together. So I do not think there
is any urgency about the API. I really do need this series for the nvme endpoint
driver though, as a first step for the API improvement.
Manivannan Sadhasivam Oct. 12, 2024, 9:40 a.m. UTC | #5
On Sat, Oct 12, 2024 at 05:30:29PM +0900, Damien Le Moal wrote:
> On 10/12/24 15:32, Manivannan Sadhasivam wrote:
> > On Fri, Oct 11, 2024 at 10:07:30AM +0900, Damien Le Moal wrote:
> >> On 10/10/24 23:36, Manivannan Sadhasivam wrote:
> >>> On Mon, Oct 07, 2024 at 01:03:15PM +0900, Damien Le Moal wrote:
> >>>> Some endpoint controllers have requirements on the alignment of the
> >>>> controller physical memory address that must be used to map a RC PCI
> >>>> address region. For instance, the rockchip endpoint controller uses
> >>>> at most the lower 20 bits of a physical memory address region as the
> >>>> lower bits of an RC PCI address. For mapping a PCI address region of
> >>>> size bytes starting from pci_addr, the exact number of address bits
> >>>> used is the number of address bits changing in the address range
> >>>> [pci_addr..pci_addr + size - 1].
> >>>>
> >>>> For this example, this creates the following constraints:
> >>>> 1) The offset into the controller physical memory allocated for a
> >>>>    mapping depends on the mapping size *and* the starting PCI address
> >>>>    for the mapping.
> >>>> 2) A mapping size cannot exceed the controller windows size (1MB) minus
> >>>>    the offset needed into the allocated physical memory, which can end
> >>>>    up being a smaller size than the desired mapping size.
> >>>>
> >>>> Handling these constraints independently of the controller being used
> >>>> in an endpoint function driver is not possible with the current EPC
> >>>> API as only the ->align field in struct pci_epc_features is provided
> >>>> and used for BAR (inbound ATU mappings) mapping. A new API is needed
> >>>> for function drivers to discover mapping constraints and handle
> >>>> non-static requirements based on the RC PCI address range to access.
> >>>>
> >>>> Introduce the function pci_epc_map_align() and the endpoint controller
> >>>> operation ->map_align to allow endpoint function drivers to obtain the
> >>>> size and the offset into a controller address region that must be
> >>>> allocated and mapped to access an RC PCI address region. The size
> >>>> of the mapping provided by pci_epc_map_align() can then be used as the
> >>>> size argument for the function pci_epc_mem_alloc_addr().
> >>>> The offset into the allocated controller memory provided can be used to
> >>>> correctly handle data transfers.
> >>>>
> >>>> For endpoint controllers that have PCI address alignment constraints,
> >>>> pci_epc_map_align() may indicate upon return an effective PCI address
> >>>> region mapping size that is smaller (but not 0) than the requested PCI
> >>>> address region size. For such case, an endpoint function driver must
> >>>> handle data accesses over the desired PCI address range in fragments,
> >>>> by repeatedly using pci_epc_map_align() over the PCI address range.
> >>>>
> >>>> The controller operation ->map_align is optional: controllers that do
> >>>> not have any alignment constraints for mapping a RC PCI address region
> >>>> do not need to implement this operation. For such controllers,
> >>>> pci_epc_map_align() always returns the mapping size as equal to the
> >>>> requested size of the PCI region and an offset equal to 0.
> >>>>
> >>>> The new structure struct pci_epc_map is introduced to represent a
> >>>> mapping start PCI address, mapping effective size, the size and offset
> >>>> into the controller memory needed for mapping the PCI address region as
> >>>> well as the physical and virtual CPU addresses of the mapping (phys_base
> >>>> and virt_base fields). For convenience, the physical and virtual CPU
> >>>> addresses within that mapping to access the target RC PCI address region
> >>>> are also provided (phys_addr and virt_addr fields).
> >>>>
> >>>
> >>> I'm fine with the concept of this patch, but I don't get why you need an API for
> >>> this and not just a callback to be used in the pci_epc_mem_{map/unmap} APIs.
> >>> Furthermore, I don't see an user of this API (in 3 series you've sent out so
> >>> far). Let me know if I failed to spot it.
> >>>
> >>> Also, the API name pci_epc_map_align() sounds like it does the mapping, but it
> >>> doesn't. So I'd not have it exposed as an API at all.
> >>
> >> OK. Fine with me. I will move this inside pci_epc_mem_map(). But note that
> >> without this function, pci_epc_mem_alloc_addr() and pci_epc_map_addr() are
> >> totally useless for EP controllers that have a mapping alignment requirement,
> >> which without the pci_epc_map_align() function, an endpoint function driver
> >> cannot discover *at all* currently. That does not fix the overall API of EPC...
> >>
> > 
> > Not at all. EPF drivers still can use "epf_mhi->epc_features->align" to discover
> > the alignment requirement and calculate the offset on their own (please see
> > pci-epf-mhi). But I'm not in favor of that approach since the APIs need to do
> > that job and that's why I like your pci_epc_mem_map() API.
> 
> That is *not* correct, at least in general. For two reasons:
> 1) epc_features->align defines alignment for BARs, that is, inbound windows
> memory. It is not supposed to be about the outbound windows for mapping PCI
> address space for doing mmio or DMA. Some controllers may have the same
> alignment constraint for both ib and ob, in which case things will work, but
> that is "just being lucky". I spent weeks with the RK3399 understanding that I
> was not lucky with that one :)
> 2) A static alignment constraint does not work for all controllers. C.f. my
> series fixing the RK3399 were I think I clearly explain that alignment of a
> mapping depends on the PCI address AND the size being mapped, as both determine
> the number of bits of address changing within the PCI address range to access.
> Using a fixed boundary alignment for the RK3399 simply does not work at all. An
> epf cannot know that simply looking at a fixed value...
> 
> What you said may be true for the mhi epf, because it requires special hardware
> that has a simple fixed alignment constraint. ntb and vntb are also coded
> assuming such constraint. So If I try to run ntb or vntg on the RK3399 it will
> likely not work (actually it may, but out of sheer luck given that the addresses
> that will be mapped will likely be aligned to 1MB, that is, the memory window size).
> 
> Developping the nvme epf driver where I was seeing completely random PCI
> addresses for command buffers, I could make things work only after developping
> the pci_epc_mem_map() with the controller operation telling the mapping
> (.get_mem_map()) for every address to map.
> 

Fair enough...

> > 
> >> By not having pci_epc_map_align(), pci_epc_mem_alloc_addr() and
> >> pci_epc_map_addr() remain broken, but the introduction of pci_epc_mem_map() does
> >> provide a working solution for the general case.
> >>
> >> So I think we will still need to do something about this bad state of the API later.
> >>
> > 
> > We can always rework the APIs to incorporate the alignment requirement.
> 
> See above. An API that advertise a simple alignment requirement will not work
> for all controllers... But anyway, given that we are not getting any problem
> report, people using the EP framework likely have setups that combine
> controllers and endpoint drivers playing well together. So I do not think there
> is any urgency about the API. I really do need this series for the nvme endpoint
> driver though, as a first step for the API improvement.
> 

No, what I meant was that you can use the new alignment callback (that takes
care of the complex alignment restrictions) in the existing map API to properly
map the addresses for all controllers in the future.

- Mani
Damien Le Moal Oct. 12, 2024, 11:06 a.m. UTC | #6
On 10/12/24 18:40, Manivannan Sadhasivam wrote:
> On Sat, Oct 12, 2024 at 05:30:29PM +0900, Damien Le Moal wrote:
>> On 10/12/24 15:32, Manivannan Sadhasivam wrote:
>>> On Fri, Oct 11, 2024 at 10:07:30AM +0900, Damien Le Moal wrote:
>>>> On 10/10/24 23:36, Manivannan Sadhasivam wrote:
>>>>> On Mon, Oct 07, 2024 at 01:03:15PM +0900, Damien Le Moal wrote:
>>>>>> Some endpoint controllers have requirements on the alignment of the
>>>>>> controller physical memory address that must be used to map a RC PCI
>>>>>> address region. For instance, the rockchip endpoint controller uses
>>>>>> at most the lower 20 bits of a physical memory address region as the
>>>>>> lower bits of an RC PCI address. For mapping a PCI address region of
>>>>>> size bytes starting from pci_addr, the exact number of address bits
>>>>>> used is the number of address bits changing in the address range
>>>>>> [pci_addr..pci_addr + size - 1].
>>>>>>
>>>>>> For this example, this creates the following constraints:
>>>>>> 1) The offset into the controller physical memory allocated for a
>>>>>>    mapping depends on the mapping size *and* the starting PCI address
>>>>>>    for the mapping.
>>>>>> 2) A mapping size cannot exceed the controller windows size (1MB) minus
>>>>>>    the offset needed into the allocated physical memory, which can end
>>>>>>    up being a smaller size than the desired mapping size.
>>>>>>
>>>>>> Handling these constraints independently of the controller being used
>>>>>> in an endpoint function driver is not possible with the current EPC
>>>>>> API as only the ->align field in struct pci_epc_features is provided
>>>>>> and used for BAR (inbound ATU mappings) mapping. A new API is needed
>>>>>> for function drivers to discover mapping constraints and handle
>>>>>> non-static requirements based on the RC PCI address range to access.
>>>>>>
>>>>>> Introduce the function pci_epc_map_align() and the endpoint controller
>>>>>> operation ->map_align to allow endpoint function drivers to obtain the
>>>>>> size and the offset into a controller address region that must be
>>>>>> allocated and mapped to access an RC PCI address region. The size
>>>>>> of the mapping provided by pci_epc_map_align() can then be used as the
>>>>>> size argument for the function pci_epc_mem_alloc_addr().
>>>>>> The offset into the allocated controller memory provided can be used to
>>>>>> correctly handle data transfers.
>>>>>>
>>>>>> For endpoint controllers that have PCI address alignment constraints,
>>>>>> pci_epc_map_align() may indicate upon return an effective PCI address
>>>>>> region mapping size that is smaller (but not 0) than the requested PCI
>>>>>> address region size. For such case, an endpoint function driver must
>>>>>> handle data accesses over the desired PCI address range in fragments,
>>>>>> by repeatedly using pci_epc_map_align() over the PCI address range.
>>>>>>
>>>>>> The controller operation ->map_align is optional: controllers that do
>>>>>> not have any alignment constraints for mapping a RC PCI address region
>>>>>> do not need to implement this operation. For such controllers,
>>>>>> pci_epc_map_align() always returns the mapping size as equal to the
>>>>>> requested size of the PCI region and an offset equal to 0.
>>>>>>
>>>>>> The new structure struct pci_epc_map is introduced to represent a
>>>>>> mapping start PCI address, mapping effective size, the size and offset
>>>>>> into the controller memory needed for mapping the PCI address region as
>>>>>> well as the physical and virtual CPU addresses of the mapping (phys_base
>>>>>> and virt_base fields). For convenience, the physical and virtual CPU
>>>>>> addresses within that mapping to access the target RC PCI address region
>>>>>> are also provided (phys_addr and virt_addr fields).
>>>>>>
>>>>>
>>>>> I'm fine with the concept of this patch, but I don't get why you need an API for
>>>>> this and not just a callback to be used in the pci_epc_mem_{map/unmap} APIs.
>>>>> Furthermore, I don't see an user of this API (in 3 series you've sent out so
>>>>> far). Let me know if I failed to spot it.
>>>>>
>>>>> Also, the API name pci_epc_map_align() sounds like it does the mapping, but it
>>>>> doesn't. So I'd not have it exposed as an API at all.
>>>>
>>>> OK. Fine with me. I will move this inside pci_epc_mem_map(). But note that
>>>> without this function, pci_epc_mem_alloc_addr() and pci_epc_map_addr() are
>>>> totally useless for EP controllers that have a mapping alignment requirement,
>>>> which without the pci_epc_map_align() function, an endpoint function driver
>>>> cannot discover *at all* currently. That does not fix the overall API of EPC...
>>>>
>>>
>>> Not at all. EPF drivers still can use "epf_mhi->epc_features->align" to discover
>>> the alignment requirement and calculate the offset on their own (please see
>>> pci-epf-mhi). But I'm not in favor of that approach since the APIs need to do
>>> that job and that's why I like your pci_epc_mem_map() API.
>>
>> That is *not* correct, at least in general. For two reasons:
>> 1) epc_features->align defines alignment for BARs, that is, inbound windows
>> memory. It is not supposed to be about the outbound windows for mapping PCI
>> address space for doing mmio or DMA. Some controllers may have the same
>> alignment constraint for both ib and ob, in which case things will work, but
>> that is "just being lucky". I spent weeks with the RK3399 understanding that I
>> was not lucky with that one :)
>> 2) A static alignment constraint does not work for all controllers. C.f. my
>> series fixing the RK3399 were I think I clearly explain that alignment of a
>> mapping depends on the PCI address AND the size being mapped, as both determine
>> the number of bits of address changing within the PCI address range to access.
>> Using a fixed boundary alignment for the RK3399 simply does not work at all. An
>> epf cannot know that simply looking at a fixed value...
>>
>> What you said may be true for the mhi epf, because it requires special hardware
>> that has a simple fixed alignment constraint. ntb and vntb are also coded
>> assuming such constraint. So If I try to run ntb or vntg on the RK3399 it will
>> likely not work (actually it may, but out of sheer luck given that the addresses
>> that will be mapped will likely be aligned to 1MB, that is, the memory window size).
>>
>> Developping the nvme epf driver where I was seeing completely random PCI
>> addresses for command buffers, I could make things work only after developping
>> the pci_epc_mem_map() with the controller operation telling the mapping
>> (.get_mem_map()) for every address to map.
>>
> 
> Fair enough...
> 
>>>
>>>> By not having pci_epc_map_align(), pci_epc_mem_alloc_addr() and
>>>> pci_epc_map_addr() remain broken, but the introduction of pci_epc_mem_map() does
>>>> provide a working solution for the general case.
>>>>
>>>> So I think we will still need to do something about this bad state of the API later.
>>>>
>>>
>>> We can always rework the APIs to incorporate the alignment requirement.
>>
>> See above. An API that advertise a simple alignment requirement will not work
>> for all controllers... But anyway, given that we are not getting any problem
>> report, people using the EP framework likely have setups that combine
>> controllers and endpoint drivers playing well together. So I do not think there
>> is any urgency about the API. I really do need this series for the nvme endpoint
>> driver though, as a first step for the API improvement.
>>
> 
> No, what I meant was that you can use the new alignment callback (that takes
> care of the complex alignment restrictions) in the existing map API to properly
> map the addresses for all controllers in the future.

The existing map API cannot alone use ->align_addr() to get the correct mapping.
It is because the memory needed to handle a mapping may be larger than the PCI
address range to map. In fact, it almost always is larger for any controller
that has a constraint. As a result, the memory allocation side
(pci_epc_alloc_addr()) must also be aware of the mapping constraint and
resulting size of the memory to allocate... Hence pci_epc_mem_map() using both
functions.
Manivannan Sadhasivam Oct. 12, 2024, 11:59 a.m. UTC | #7
On Sat, Oct 12, 2024 at 08:06:46PM +0900, Damien Le Moal wrote:
> On 10/12/24 18:40, Manivannan Sadhasivam wrote:
> > On Sat, Oct 12, 2024 at 05:30:29PM +0900, Damien Le Moal wrote:
> >> On 10/12/24 15:32, Manivannan Sadhasivam wrote:
> >>> On Fri, Oct 11, 2024 at 10:07:30AM +0900, Damien Le Moal wrote:
> >>>> On 10/10/24 23:36, Manivannan Sadhasivam wrote:
> >>>>> On Mon, Oct 07, 2024 at 01:03:15PM +0900, Damien Le Moal wrote:
> >>>>>> Some endpoint controllers have requirements on the alignment of the
> >>>>>> controller physical memory address that must be used to map a RC PCI
> >>>>>> address region. For instance, the rockchip endpoint controller uses
> >>>>>> at most the lower 20 bits of a physical memory address region as the
> >>>>>> lower bits of an RC PCI address. For mapping a PCI address region of
> >>>>>> size bytes starting from pci_addr, the exact number of address bits
> >>>>>> used is the number of address bits changing in the address range
> >>>>>> [pci_addr..pci_addr + size - 1].
> >>>>>>
> >>>>>> For this example, this creates the following constraints:
> >>>>>> 1) The offset into the controller physical memory allocated for a
> >>>>>>    mapping depends on the mapping size *and* the starting PCI address
> >>>>>>    for the mapping.
> >>>>>> 2) A mapping size cannot exceed the controller windows size (1MB) minus
> >>>>>>    the offset needed into the allocated physical memory, which can end
> >>>>>>    up being a smaller size than the desired mapping size.
> >>>>>>
> >>>>>> Handling these constraints independently of the controller being used
> >>>>>> in an endpoint function driver is not possible with the current EPC
> >>>>>> API as only the ->align field in struct pci_epc_features is provided
> >>>>>> and used for BAR (inbound ATU mappings) mapping. A new API is needed
> >>>>>> for function drivers to discover mapping constraints and handle
> >>>>>> non-static requirements based on the RC PCI address range to access.
> >>>>>>
> >>>>>> Introduce the function pci_epc_map_align() and the endpoint controller
> >>>>>> operation ->map_align to allow endpoint function drivers to obtain the
> >>>>>> size and the offset into a controller address region that must be
> >>>>>> allocated and mapped to access an RC PCI address region. The size
> >>>>>> of the mapping provided by pci_epc_map_align() can then be used as the
> >>>>>> size argument for the function pci_epc_mem_alloc_addr().
> >>>>>> The offset into the allocated controller memory provided can be used to
> >>>>>> correctly handle data transfers.
> >>>>>>
> >>>>>> For endpoint controllers that have PCI address alignment constraints,
> >>>>>> pci_epc_map_align() may indicate upon return an effective PCI address
> >>>>>> region mapping size that is smaller (but not 0) than the requested PCI
> >>>>>> address region size. For such case, an endpoint function driver must
> >>>>>> handle data accesses over the desired PCI address range in fragments,
> >>>>>> by repeatedly using pci_epc_map_align() over the PCI address range.
> >>>>>>
> >>>>>> The controller operation ->map_align is optional: controllers that do
> >>>>>> not have any alignment constraints for mapping a RC PCI address region
> >>>>>> do not need to implement this operation. For such controllers,
> >>>>>> pci_epc_map_align() always returns the mapping size as equal to the
> >>>>>> requested size of the PCI region and an offset equal to 0.
> >>>>>>
> >>>>>> The new structure struct pci_epc_map is introduced to represent a
> >>>>>> mapping start PCI address, mapping effective size, the size and offset
> >>>>>> into the controller memory needed for mapping the PCI address region as
> >>>>>> well as the physical and virtual CPU addresses of the mapping (phys_base
> >>>>>> and virt_base fields). For convenience, the physical and virtual CPU
> >>>>>> addresses within that mapping to access the target RC PCI address region
> >>>>>> are also provided (phys_addr and virt_addr fields).
> >>>>>>
> >>>>>
> >>>>> I'm fine with the concept of this patch, but I don't get why you need an API for
> >>>>> this and not just a callback to be used in the pci_epc_mem_{map/unmap} APIs.
> >>>>> Furthermore, I don't see an user of this API (in 3 series you've sent out so
> >>>>> far). Let me know if I failed to spot it.
> >>>>>
> >>>>> Also, the API name pci_epc_map_align() sounds like it does the mapping, but it
> >>>>> doesn't. So I'd not have it exposed as an API at all.
> >>>>
> >>>> OK. Fine with me. I will move this inside pci_epc_mem_map(). But note that
> >>>> without this function, pci_epc_mem_alloc_addr() and pci_epc_map_addr() are
> >>>> totally useless for EP controllers that have a mapping alignment requirement,
> >>>> which without the pci_epc_map_align() function, an endpoint function driver
> >>>> cannot discover *at all* currently. That does not fix the overall API of EPC...
> >>>>
> >>>
> >>> Not at all. EPF drivers still can use "epf_mhi->epc_features->align" to discover
> >>> the alignment requirement and calculate the offset on their own (please see
> >>> pci-epf-mhi). But I'm not in favor of that approach since the APIs need to do
> >>> that job and that's why I like your pci_epc_mem_map() API.
> >>
> >> That is *not* correct, at least in general. For two reasons:
> >> 1) epc_features->align defines alignment for BARs, that is, inbound windows
> >> memory. It is not supposed to be about the outbound windows for mapping PCI
> >> address space for doing mmio or DMA. Some controllers may have the same
> >> alignment constraint for both ib and ob, in which case things will work, but
> >> that is "just being lucky". I spent weeks with the RK3399 understanding that I
> >> was not lucky with that one :)
> >> 2) A static alignment constraint does not work for all controllers. C.f. my
> >> series fixing the RK3399 were I think I clearly explain that alignment of a
> >> mapping depends on the PCI address AND the size being mapped, as both determine
> >> the number of bits of address changing within the PCI address range to access.
> >> Using a fixed boundary alignment for the RK3399 simply does not work at all. An
> >> epf cannot know that simply looking at a fixed value...
> >>
> >> What you said may be true for the mhi epf, because it requires special hardware
> >> that has a simple fixed alignment constraint. ntb and vntb are also coded
> >> assuming such constraint. So If I try to run ntb or vntg on the RK3399 it will
> >> likely not work (actually it may, but out of sheer luck given that the addresses
> >> that will be mapped will likely be aligned to 1MB, that is, the memory window size).
> >>
> >> Developping the nvme epf driver where I was seeing completely random PCI
> >> addresses for command buffers, I could make things work only after developping
> >> the pci_epc_mem_map() with the controller operation telling the mapping
> >> (.get_mem_map()) for every address to map.
> >>
> > 
> > Fair enough...
> > 
> >>>
> >>>> By not having pci_epc_map_align(), pci_epc_mem_alloc_addr() and
> >>>> pci_epc_map_addr() remain broken, but the introduction of pci_epc_mem_map() does
> >>>> provide a working solution for the general case.
> >>>>
> >>>> So I think we will still need to do something about this bad state of the API later.
> >>>>
> >>>
> >>> We can always rework the APIs to incorporate the alignment requirement.
> >>
> >> See above. An API that advertise a simple alignment requirement will not work
> >> for all controllers... But anyway, given that we are not getting any problem
> >> report, people using the EP framework likely have setups that combine
> >> controllers and endpoint drivers playing well together. So I do not think there
> >> is any urgency about the API. I really do need this series for the nvme endpoint
> >> driver though, as a first step for the API improvement.
> >>
> > 
> > No, what I meant was that you can use the new alignment callback (that takes
> > care of the complex alignment restrictions) in the existing map API to properly
> > map the addresses for all controllers in the future.
> 
> The existing map API cannot alone use ->align_addr() to get the correct mapping.
> It is because the memory needed to handle a mapping may be larger than the PCI
> address range to map. In fact, it almost always is larger for any controller
> that has a constraint. As a result, the memory allocation side
> (pci_epc_alloc_addr()) must also be aware of the mapping constraint and
> resulting size of the memory to allocate... Hence pci_epc_mem_map() using both
> functions.
> 

Ah, I missed that. Thanks for clarifying!

- Mani
diff mbox series

Patch

diff --git a/drivers/pci/endpoint/pci-epc-core.c b/drivers/pci/endpoint/pci-epc-core.c
index b854f1bab26f..1adccf07c33e 100644
--- a/drivers/pci/endpoint/pci-epc-core.c
+++ b/drivers/pci/endpoint/pci-epc-core.c
@@ -435,6 +435,72 @@  void pci_epc_unmap_addr(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
 }
 EXPORT_SYMBOL_GPL(pci_epc_unmap_addr);
 
+/**
+ * pci_epc_map_align() - Get the offset into and the size of a controller memory
+ *			 address region needed to map a RC PCI address region
+ * @epc: the EPC device on which address is allocated
+ * @func_no: the physical endpoint function number in the EPC device
+ * @vfunc_no: the virtual endpoint function number in the physical function
+ * @pci_addr: The RC PCI address to map
+ * @pci_size: the number of bytes to map starting from @pci_addr
+ * @map: populate here the actual size and offset into the controller memory
+ *       that must be allocated for the mapping
+ *
+ * If defined, invoke the controller map_align operation to obtain the size and
+ * the offset into a controller address region that must be allocated to map
+ * @pci_size bytes of the RC PCI address space starting from @pci_addr.
+ *
+ * On return, @map->pci_size indicates the effective size of the mapping that
+ * can be handled by the controller. This size may be smaller than the requested
+ * @pci_size. In such case, the endpoint function driver must handle the mapping
+ * using several fragments. The offset into the controller memory for the
+ * effective mapping of the RC PCI address range
+ * @pci_addr..@pci_addr+@map->pci_size is indicated by @map->map_ofst.
+ *
+ * If the target controller does not define a map_align operation, it is assumed
+ * that the controller has no PCI address mapping alignment constraint.
+ */
+int pci_epc_map_align(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
+		      u64 pci_addr, size_t pci_size, struct pci_epc_map *map)
+{
+	int ret;
+
+	if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
+		return -EINVAL;
+
+	if (!pci_size || !map)
+		return -EINVAL;
+
+	/*
+	 * Initialize and remember the PCI address region to be mapped. The
+	 * controller ->map_align() operation may change the map->pci_size to a
+	 * smaller value.
+	 */
+	memset(map, 0, sizeof(*map));
+	map->pci_addr = pci_addr;
+	map->pci_size = pci_size;
+
+	if (!epc->ops->map_align) {
+		/*
+		 * Assume that the EP controller has no alignment constraint,
+		 * that is, that the PCI address to map and the size of the
+		 * controller memory needed for the mapping are the same as
+		 * specified by the caller.
+		 */
+		map->map_pci_addr = pci_addr;
+		map->map_size = pci_size;
+		map->map_ofst = 0;
+		return 0;
+	}
+
+	mutex_lock(&epc->lock);
+	ret = epc->ops->map_align(epc, func_no, vfunc_no, map);
+	mutex_unlock(&epc->lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(pci_epc_map_align);
+
 /**
  * pci_epc_map_addr() - map CPU address to PCI address
  * @epc: the EPC device on which address is allocated
diff --git a/include/linux/pci-epc.h b/include/linux/pci-epc.h
index 42ef06136bd1..9df8a83e8d10 100644
--- a/include/linux/pci-epc.h
+++ b/include/linux/pci-epc.h
@@ -32,11 +32,44 @@  pci_epc_interface_string(enum pci_epc_interface_type type)
 	}
 }
 
+/**
+ * struct pci_epc_map - information about EPC memory for mapping a RC PCI
+ *                      address range
+ * @pci_addr: start address of the RC PCI address range to map
+ * @pci_size: size of the RC PCI address range mapped from @pci_addr
+ * @map_pci_addr: RC PCI address used as the first address mapped (may be lower
+ *                than @pci_addr)
+ * @map_size: size of the controller memory needed for mapping the RC PCI address
+ *            range @pci_addr..@pci_addr+@pci_size
+ * @map_ofst: offset into the mapped controller memory to access @pci_addr
+ * @phys_base: base physical address of the allocated EPC memory for mapping the
+ *             RC PCI address range
+ * @phys_addr: physical address at which @pci_addr is mapped
+ * @virt_base: base virtual address of the allocated EPC memory for mapping the
+ *             RC PCI address range
+ * @virt_addr: virtual address at which @pci_addr is mapped
+ */
+struct pci_epc_map {
+	phys_addr_t	pci_addr;
+	size_t		pci_size;
+
+	phys_addr_t	map_pci_addr;
+	size_t		map_size;
+	phys_addr_t	map_ofst;
+
+	phys_addr_t	phys_base;
+	phys_addr_t	phys_addr;
+	void __iomem	*virt_base;
+	void __iomem	*virt_addr;
+};
+
 /**
  * struct pci_epc_ops - set of function pointers for performing EPC operations
  * @write_header: ops to populate configuration space header
  * @set_bar: ops to configure the BAR
  * @clear_bar: ops to reset the BAR
+ * @map_align: operation to get the size and offset into a controller memory
+ *             window needed to map an RC PCI address region
  * @map_addr: ops to map CPU address to PCI address
  * @unmap_addr: ops to unmap CPU address and PCI address
  * @set_msi: ops to set the requested number of MSI interrupts in the MSI
@@ -61,6 +94,8 @@  struct pci_epc_ops {
 			   struct pci_epf_bar *epf_bar);
 	void	(*clear_bar)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
 			     struct pci_epf_bar *epf_bar);
+	int	(*map_align)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
+			    struct pci_epc_map *map);
 	int	(*map_addr)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
 			    phys_addr_t addr, u64 pci_addr, size_t size);
 	void	(*unmap_addr)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
@@ -241,6 +276,8 @@  int pci_epc_set_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
 		    struct pci_epf_bar *epf_bar);
 void pci_epc_clear_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
 		       struct pci_epf_bar *epf_bar);
+int pci_epc_map_align(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
+		      u64 pci_addr, size_t size, struct pci_epc_map *map);
 int pci_epc_map_addr(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
 		     phys_addr_t phys_addr,
 		     u64 pci_addr, size_t size);