
[v2,1/1] iommu-api: Add map_range/unmap_range functions

Message ID 1405558917-7597-2-git-send-email-ohaugan@codeaurora.org (mailing list archive)
State New, archived

Commit Message

Olav Haugan July 17, 2014, 1:01 a.m. UTC
Mapping and unmapping are more often than not in the critical path.
map_range and unmap_range allows SMMU driver implementations to optimize
the process of mapping and unmapping buffers into the SMMU page tables.
Instead of mapping one physical address, do TLB operation (expensive),
mapping, do TLB operation, mapping, do TLB operation the driver can map
a scatter-gatherlist of physically contiguous pages into one virtual
address space and then at the end do one TLB operation.

Additionally, the mapping operation would be faster in general since
clients does not have to keep calling map API over and over again for
each physically contiguous chunk of memory that needs to be mapped to a
virtually contiguous region.

Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
---
 drivers/iommu/iommu.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/iommu.h | 25 +++++++++++++++++++++++++
 2 files changed, 73 insertions(+)

Comments

Thierry Reding July 17, 2014, 8:21 a.m. UTC | #1
On Wed, Jul 16, 2014 at 06:01:57PM -0700, Olav Haugan wrote:
> Mapping and unmapping are more often than not in the critical path.
> map_range and unmap_range allows SMMU driver implementations to optimize

s/SMMU/IOMMU/

> the process of mapping and unmapping buffers into the SMMU page tables.

s/SMMU/IOMMU/

> Instead of mapping one physical address, do TLB operation (expensive),
> mapping, do TLB operation, mapping, do TLB operation the driver can map
> a scatter-gatherlist of physically contiguous pages into one virtual
> address space and then at the end do one TLB operation.

I find the above hard to read. Maybe:

Instead of mapping a buffer one page at a time and requiring potentially
expensive TLB operations for each page, this function allows the driver
to map all pages in one go and defer TLB maintenance until after all
pages have been mapped.

?

> Additionally, the mapping operation would be faster in general since
> clients does not have to keep calling map API over and over again for
> each physically contiguous chunk of memory that needs to be mapped to a
> virtually contiguous region.
> 
> Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
> ---
>  drivers/iommu/iommu.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/iommu.h | 25 +++++++++++++++++++++++++
>  2 files changed, 73 insertions(+)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 1698360..a0eebb7 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1089,6 +1089,54 @@ size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
>  EXPORT_SYMBOL_GPL(iommu_unmap);
>  
>  
> +int iommu_map_range(struct iommu_domain *domain, unsigned int iova,

Maybe iova should be dma_addr_t? Or at least unsigned long? And perhaps
iommu_map_sg() would be more consistent with the equivalent function in
struct dma_ops?

> +		    struct scatterlist *sg, unsigned int len, int opt)

The length argument seems to be the size of the mapping. Again, the
struct dma_ops function uses this argument to denote the number of
entries in the scatterlist.

opt is somewhat opaque. Perhaps this should be turned into unsigned long
flags? Although given that there aren't any users yet it's difficult to
say what's best here. Perhaps the addition of this argument should be
postponed until there are actual users?

> +{
> +	s32 ret = 0;

Should be int to match the function's return type.

> +	u32 offset = 0;
> +	u32 start_iova = iova;

These should match the type of iova. Also, what's the point of
start_iova if we can simply keep iova constant and use offset where
necessary?

> +	BUG_ON(iova & (~PAGE_MASK));
> +
> +	if (unlikely(domain->ops->map_range == NULL)) {
> +		while (offset < len) {

Maybe this should use for_each_sg()?
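
For illustration, a rough sketch of what the fallback loop might look like
with for_each_sg(), keeping iova constant as suggested earlier. This assumes
the function is given the number of scatterlist entries (nents) rather than a
byte length; the other names follow the patch, and the unwind path is only
hinted at in a comment:

	struct scatterlist *s;
	unsigned int i;
	size_t offset = 0;

	for_each_sg(sg, s, nents, i) {
		phys_addr_t phys = page_to_phys(sg_page(s));
		size_t page_len = PAGE_ALIGN(s->offset + s->length);

		ret = iommu_map(domain, iova + offset, phys, page_len, opt);
		if (ret)
			goto fail; /* unwind with iommu_unmap(domain, iova, offset) */

		offset += page_len;
	}
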

> +			phys_addr_t phys = page_to_phys(sg_page(sg));
> +			u32 page_len = PAGE_ALIGN(sg->offset + sg->length);

Shouldn't this alignment be left to iommu_map() to handle? It has code
to deal with that already.

> +			ret = iommu_map(domain, iova, phys, page_len, opt);

This conflates the new opt argument with iommu_map()'s prot argument.
Maybe those two should rather be split?

> +			if (ret)
> +				goto fail;
> +
> +			iova += page_len;
> +			offset += page_len;
> +			if (offset < len)
> +				sg = sg_next(sg);
> +		}
> +	} else {
> +		ret = domain->ops->map_range(domain, iova, sg, len, opt);
> +	}

Perhaps rather than check for a ->map_range implementation every time a
better option may be to export this generic implementation so that
drivers can set it in their iommu_ops if they don't implement it? So the
contents of the if () block could become a new function:

	int iommu_map_range_generic(...)
	{
		...
	}
	EXPORT_SYMBOL(iommu_map_range_generic);

And drivers would do this:

	static const struct iommu_ops driver_iommu_ops = {
		...
		.map_range = iommu_map_range_generic,
		...
	};

> +int iommu_unmap_range(struct iommu_domain *domain, unsigned int iova,
> +		      unsigned int len, int opt)

Some comments regarding function name and argument types as for
iommu_map_range().

> +static inline int iommu_map_range(struct iommu_domain *domain,
> +				  unsigned int iova, struct scatterlist *sg,
> +				  unsigned int len, int opt)
> +{
> +	return -ENODEV;

I know other IOMMU API dummies already use this error code, but I find
it to be a little confusing. The dummies are used when the IOMMU API is
disabled via Kconfig, so -ENOSYS (Function not implemented) seems like a
more useful error.

Thierry
Olav Haugan July 22, 2014, 12:59 a.m. UTC | #2
On 7/17/2014 1:21 AM, Thierry Reding wrote:
> On Wed, Jul 16, 2014 at 06:01:57PM -0700, Olav Haugan wrote:
>> Mapping and unmapping are more often than not in the critical path.
>> map_range and unmap_range allows SMMU driver implementations to optimize
> 
> s/SMMU/IOMMU/
>
>> the process of mapping and unmapping buffers into the SMMU page tables.
> 
> s/SMMU/IOMMU/
> 
>> Instead of mapping one physical address, do TLB operation (expensive),
>> mapping, do TLB operation, mapping, do TLB operation the driver can map
>> a scatter-gatherlist of physically contiguous pages into one virtual
>> address space and then at the end do one TLB operation.
> 
> I find the above hard to read. Maybe:
> 
> Instead of mapping a buffer one page at a time and requiring potentially
> expensive TLB operations for each page, this function allows the driver
> to map all pages in one go and defer TLB maintenance until after all
> pages have been mapped.

Yeah, all above is OK with me.

> 
>> Additionally, the mapping operation would be faster in general since
>> clients does not have to keep calling map API over and over again for
>> each physically contiguous chunk of memory that needs to be mapped to a
>> virtually contiguous region.
>>
>> Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
>> ---
>>  drivers/iommu/iommu.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  include/linux/iommu.h | 25 +++++++++++++++++++++++++
>>  2 files changed, 73 insertions(+)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 1698360..a0eebb7 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -1089,6 +1089,54 @@ size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
>>  EXPORT_SYMBOL_GPL(iommu_unmap);
>>  
>>  
>> +int iommu_map_range(struct iommu_domain *domain, unsigned int iova,
> 
> Maybe iova should be dma_addr_t? Or at least unsigned long? And perhaps
> iommu_map_sg() would be more consistent with the equivalent function in
> struct dma_ops?
> 
>> +		    struct scatterlist *sg, unsigned int len, int opt)
> 
> The length argument seems to be the size of the mapping. Again, the
> struct dma_ops function uses this argument to denote the number of
> entries in the scatterlist.
> 
> opt is somewhat opaque. Perhaps this should be turned into unsigned long
> flags? Although given that there aren't any users yet it's difficult to
> say what's best here. Perhaps the addition of this argument should be
> postponed until there are actual users?

I am thinking something like this:

int iommu_map_sg(struct iommu_domain *domain, struct scatterlist *sg,
unsigned int nents, int prot, unsigned long flags);
int iommu_unmap_sg(struct iommu_domain *domain, struct scatterlist *sg,
unsigned int nents, unsigned long flags);

The iova is contained within sg so we don't need that argument really
and I would like to keep the flags argument. I would prefer not to
change the API after it has been published which could potentially
affect a lot of call sites.

>> +{
>> +	s32 ret = 0;
> 
> Should be int to match the function's return type.
> 
>> +	u32 offset = 0;
>> +	u32 start_iova = iova;
> 
> These should match the type of iova. Also, what's the point of
> start_iova if we can simply keep iova constant and use offset where
> necessary?
> 
>> +	BUG_ON(iova & (~PAGE_MASK));
>> +
>> +	if (unlikely(domain->ops->map_range == NULL)) {
>> +		while (offset < len) {
> 
> Maybe this should use for_each_sg()?
> 
>> +			phys_addr_t phys = page_to_phys(sg_page(sg));
>> +			u32 page_len = PAGE_ALIGN(sg->offset + sg->length);
> 
> Shouldn't this alignment be left to iommu_map() to handle? It has code
> to deal with that already.

I don't see page alignment in the iommu_map function. I only see a check
whether the (iova | paddr | size) is aligned to the minimum page size
and then it errors out if it isn't....


>> +			ret = iommu_map(domain, iova, phys, page_len, opt);
> 
> This conflates the new opt argument with iommu_map()'s prot argument.
> Maybe those two should rather be split?
> 
>> +			if (ret)
>> +				goto fail;
>> +
>> +			iova += page_len;
>> +			offset += page_len;
>> +			if (offset < len)
>> +				sg = sg_next(sg);
>> +		}
>> +	} else {
>> +		ret = domain->ops->map_range(domain, iova, sg, len, opt);
>> +	}
> 
> Perhaps rather than check for a ->map_range implementation every time a
> better option may be to export this generic implementation so that
> drivers can set it in their iommu_ops if they don't implement it? So the
> contents of the if () block could become a new function:
> 
> 	int iommu_map_range_generic(...)
> 	{
> 		...
> 	}
> 	EXPORT_SYMBOL(iommu_map_range_generic);
> 
> And drivers would do this:
> 
> 	static const struct iommu_ops driver_iommu_ops = {
> 		...
> 		.map_range = iommu_map_range_generic,
> 		...
> 	};

I'd like to keep the new API consistent with the rest of the API. Most
if not all of the other APIs always check if the operation is non-NULL.
A new driver could choose not to set the .map_range callback. I think it
is better to keep this consistent with the behavior of the other APIs.

>> +int iommu_unmap_range(struct iommu_domain *domain, unsigned int iova,
>> +		      unsigned int len, int opt)
> 
> Some comments regarding function name and argument types as for
> iommu_map_range().
> 
>> +static inline int iommu_map_range(struct iommu_domain *domain,
>> +				  unsigned int iova, struct scatterlist *sg,
>> +				  unsigned int len, int opt)
>> +{
>> +	return -ENODEV;
> 
> I know other IOMMU API dummies already use this error code, but I find
> it to be a little confusing. The dummies are used when the IOMMU API is
> disabled via Kconfig, so -ENOSYS (Function not implemented) seems like a
> more useful error.

Again, I would prefer to keep this consistent with the other APIs
already there. iommu_map/iommu_unmap both return -ENODEV. If we want to
change this I think it should be done as a separate patch that changes
all of them to be consistent.

Thanks,

Olav
Thierry Reding July 22, 2014, 7:45 a.m. UTC | #3
On Mon, Jul 21, 2014 at 05:59:22PM -0700, Olav Haugan wrote:
> On 7/17/2014 1:21 AM, Thierry Reding wrote:
> > On Wed, Jul 16, 2014 at 06:01:57PM -0700, Olav Haugan wrote:
[...]
> > > Additionally, the mapping operation would be faster in general since
> > > clients does not have to keep calling map API over and over again for
> > > each physically contiguous chunk of memory that needs to be mapped to a
> > > virtually contiguous region.
> > >
> > > Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
> > > ---
> > >  drivers/iommu/iommu.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
> > >  include/linux/iommu.h | 25 +++++++++++++++++++++++++
> > >  2 files changed, 73 insertions(+)
> > >
> > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > > index 1698360..a0eebb7 100644
> > > --- a/drivers/iommu/iommu.c
> > > +++ b/drivers/iommu/iommu.c
> > > @@ -1089,6 +1089,54 @@ size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
> > >  EXPORT_SYMBOL_GPL(iommu_unmap);
> > >  
> > >  
> > > +int iommu_map_range(struct iommu_domain *domain, unsigned int iova,
> > 
> > Maybe iova should be dma_addr_t? Or at least unsigned long? And perhaps
> > iommu_map_sg() would be more consistent with the equivalent function in
> > struct dma_ops?
> > 
> >> +		    struct scatterlist *sg, unsigned int len, int opt)
> > 
> > The length argument seems to be the size of the mapping. Again, the
> > struct dma_ops function uses this argument to denote the number of
> > entries in the scatterlist.
> > 
> > opt is somewhat opaque. Perhaps this should be turned into unsigned long
> > flags? Although given that there aren't any users yet it's difficult to
> > say what's best here. Perhaps the addition of this argument should be
> > postponed until there are actual users?
> 
> I am thinking something like this:
> 
> int iommu_map_sg(struct iommu_domain *domain, struct scatterlist *sg,
> unsigned int nents, int prot, unsigned long flags);
> int iommu_unmap_sg(struct iommu_domain *domain, struct scatterlist *sg,
> unsigned int nents, unsigned long flags);

Looks good.

> The iova is contained within sg so we don't need that argument really

I'm not sure. I think a common use-case for this function is for some
driver to map an imported DMA-BUF. In that case you'll get a struct
sg_table, but I think it won't have sg.dma_address set to anything
useful. So if we don't have iova as a parameter to this function, the
driver would have to make it a two-step process, like this:

	sg_dma_address(sg) = iova;

	err = iommu_map_sg(...);

And drivers that use the IOMMU API directly need to manage IOVA space
themselves anyway, so I think passing around the IOVA within the SG
won't be a very common case. It will almost always be the driver that
calls this function which allocates the IOVA range.

> and I would like to keep the flags argument. I would prefer not to
> change the API after it has been published which could potentially
> affect a lot of call sites.

We have pretty good tools to help with this kind of mechanical
conversion, so I don't think changing the API will be much of a problem.
However it seems likely that we'll want to specify flags eventually, so
I don't have any strong objections to keeping that parameter.

> >> +			phys_addr_t phys = page_to_phys(sg_page(sg));
> >> +			u32 page_len = PAGE_ALIGN(sg->offset + sg->length);
> > 
> > Shouldn't this alignment be left to iommu_map() to handle? It has code
> > to deal with that already.
> 
> I don't see page alignment in the iommu_map function. I only see a check
> whether the (iova | paddr | size) is aligned to the minimum page size
> and then it errors out if it isn't....

Indeed, the above doesn't do what I thought it did.

> > Perhaps rather than check for a ->map_range implementation every time a
> > better option may be to export this generic implementation so that
> > drivers can set it in their iommu_ops if they don't implement it? So the
> > contents of the if () block could become a new function:
> > 
> > 	int iommu_map_range_generic(...)
> > 	{
> > 		...
> > 	}
> > 	EXPORT_SYMBOL(iommu_map_range_generic);
> > 
> > And drivers would do this:
> > 
> > 	static const struct iommu_ops driver_iommu_ops = {
> > 		...
> > 		.map_range = iommu_map_range_generic,
> > 		...
> > 	};
> 
> I'd like to keep the new API consistent with the rest of the API. Most
> if not all of the other APIs always check if the operation is non-NULL.

But that's because the other operations are either optional or they
don't provide a fallback implementation. For .map_sg() the issue is
different because the core provides a slow fallback. The disadvantage
of keeping the check within iommu_map_sg() is that you have this check
every time the function is called. If on the other hand drivers
specifically set a pointer (either a custom implementation or the
wrapper around iommu_map()) you'll simply call the driver function no
matter what and don't have to special case.

It also makes it possible for drivers to opt-in to using the generic
fallback. Currently if driver writers don't set it explicitly they'll
silently get a fallback implementation and they may not even notice.

> A new driver could choose not to set the .map_range callback. I think it
> is better to keep this consistent with the behavior of the other APIs.

I'd argue that it's more consistent to not provide the fallback by
default. None of the other functions do that. If the driver doesn't
implement a callback then the iommu_*() functions will return an error.

For .map_sg() I think pretty much every driver will want to implement
it, so requiring developers to explicitly set it for their driver seems
like a good idea. If there's no advantage in implementing it then they
can always get the same functionality by explicitly using the fallback.

> >> +int iommu_unmap_range(struct iommu_domain *domain, unsigned int iova,
> >> +		      unsigned int len, int opt)
> > 
> > Some comments regarding function name and argument types as for
> > iommu_map_range().
> > 
> >> +static inline int iommu_map_range(struct iommu_domain *domain,
> >> +				  unsigned int iova, struct scatterlist *sg,
> >> +				  unsigned int len, int opt)
> >> +{
> >> +	return -ENODEV;
> > 
> > I know other IOMMU API dummies already use this error code, but I find
> > it to be a little confusing. The dummies are used when the IOMMU API is
> > disabled via Kconfig, so -ENOSYS (Function not implemented) seems like a
> > more useful error.
> 
> Again, I would prefer to keep this consistent with the other APIs
> already there. iommu_map/iommu_unmap both return -ENODEV. If we want to
> change this I think it should be done as a separate patch that changes
> all of them to be consistent.

Fair enough.

Thierry
Rob Clark July 22, 2014, 3:07 p.m. UTC | #4
On Mon, Jul 21, 2014 at 8:59 PM, Olav Haugan <ohaugan@codeaurora.org> wrote:
> On 7/17/2014 1:21 AM, Thierry Reding wrote:
>> On Wed, Jul 16, 2014 at 06:01:57PM -0700, Olav Haugan wrote:
>>> Mapping and unmapping are more often than not in the critical path.
>>> map_range and unmap_range allows SMMU driver implementations to optimize
>>
>> s/SMMU/IOMMU/
>>
>>> the process of mapping and unmapping buffers into the SMMU page tables.
>>
>> s/SMMU/IOMMU/
>>
>>> Instead of mapping one physical address, do TLB operation (expensive),
>>> mapping, do TLB operation, mapping, do TLB operation the driver can map
>>> a scatter-gatherlist of physically contiguous pages into one virtual
>>> address space and then at the end do one TLB operation.
>>
>> I find the above hard to read. Maybe:
>>
>> Instead of mapping a buffer one page at a time and requiring potentially
>> expensive TLB operations for each page, this function allows the driver
>> to map all pages in one go and defer TLB maintenance until after all
>> pages have been mapped.
>
> Yeah, all above is OK with me.
>
>>
>>> Additionally, the mapping operation would be faster in general since
>>> clients does not have to keep calling map API over and over again for
>>> each physically contiguous chunk of memory that needs to be mapped to a
>>> virtually contiguous region.
>>>
>>> Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
>>> ---
>>>  drivers/iommu/iommu.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>  include/linux/iommu.h | 25 +++++++++++++++++++++++++
>>>  2 files changed, 73 insertions(+)
>>>
>>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>>> index 1698360..a0eebb7 100644
>>> --- a/drivers/iommu/iommu.c
>>> +++ b/drivers/iommu/iommu.c
>>> @@ -1089,6 +1089,54 @@ size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
>>>  EXPORT_SYMBOL_GPL(iommu_unmap);
>>>
>>>
>>> +int iommu_map_range(struct iommu_domain *domain, unsigned int iova,
>>
>> Maybe iova should be dma_addr_t? Or at least unsigned long? And perhaps
>> iommu_map_sg() would be more consistent with the equivalent function in
>> struct dma_ops?
>>
>>> +                struct scatterlist *sg, unsigned int len, int opt)
>>
>> The length argument seems to be the size of the mapping. Again, the
>> struct dma_ops function uses this argument to denote the number of
>> entries in the scatterlist.
>>
>> opt is somewhat opaque. Perhaps this should be turned into unsigned long
>> flags? Although given that there aren't any users yet it's difficult to
>> say what's best here. Perhaps the addition of this argument should be
>> postponed until there are actual users?
>
> I am thinking something like this:
>
> int iommu_map_sg(struct iommu_domain *domain, struct scatterlist *sg,
> unsigned int nents, int prot, unsigned long flags);
> int iommu_unmap_sg(struct iommu_domain *domain, struct scatterlist *sg,
> unsigned int nents, unsigned long flags);
>
> The iova is contained within sg so we don't need that argument really
> and I would like to keep the flags argument. I would prefer not to
> change the API after it has been published which could potentially
> affect a lot of call sites.

ugg.. that at least forces me to construct a separate sg for mapping
the same buffer in multiple processes' gpu addr space.  Not really a
fan of that.

BR,
-R

>>> +{
>>> +    s32 ret = 0;
>>
>> Should be int to match the function's return type.
>>
>>> +    u32 offset = 0;
>>> +    u32 start_iova = iova;
>>
>> These should match the type of iova. Also, what's the point of
>> start_iova if we can simply keep iova constant and use offset where
>> necessary?
>>
>>> +    BUG_ON(iova & (~PAGE_MASK));
>>> +
>>> +    if (unlikely(domain->ops->map_range == NULL)) {
>>> +            while (offset < len) {
>>
>> Maybe this should use for_each_sg()?
>>
>>> +                    phys_addr_t phys = page_to_phys(sg_page(sg));
>>> +                    u32 page_len = PAGE_ALIGN(sg->offset + sg->length);
>>
>> Shouldn't this alignment be left to iommu_map() to handle? It has code
>> to deal with that already.
>
> I don't see page alignment in the iommu_map function. I only see a check
> whether the (iova | paddr | size) is aligned to the minimum page size
> and then it errors out if it isn't....
>
>
>>> +                    ret = iommu_map(domain, iova, phys, page_len, opt);
>>
>> This conflates the new opt argument with iommu_map()'s prot argument.
>> Maybe those two should rather be split?
>>
>>> +                    if (ret)
>>> +                            goto fail;
>>> +
>>> +                    iova += page_len;
>>> +                    offset += page_len;
>>> +                    if (offset < len)
>>> +                            sg = sg_next(sg);
>>> +            }
>>> +    } else {
>>> +            ret = domain->ops->map_range(domain, iova, sg, len, opt);
>>> +    }
>>
>> Perhaps rather than check for a ->map_range implementation every time a
>> better option may be to export this generic implementation so that
>> drivers can set it in their iommu_ops if they don't implement it? So the
>> contents of the if () block could become a new function:
>>
>>       int iommu_map_range_generic(...)
>>       {
>>               ...
>>       }
>>       EXPORT_SYMBOL(iommu_map_range_generic);
>>
>> And drivers would do this:
>>
>>       static const struct iommu_ops driver_iommu_ops = {
>>               ...
>>               .map_range = iommu_map_range_generic,
>>               ...
>>       };
>
> I'd like to keep the new API consistent with the rest of the API. Most
> if not all of the other APIs always check if the operation is non-NULL.
> A new driver could choose not to set the .map_range callback. I think it
> is better to keep this consistent with the behavior of the other APIs.
>
>>> +int iommu_unmap_range(struct iommu_domain *domain, unsigned int iova,
>>> +                  unsigned int len, int opt)
>>
>> Some comments regarding function name and argument types as for
>> iommu_map_range().
>>
>>> +static inline int iommu_map_range(struct iommu_domain *domain,
>>> +                              unsigned int iova, struct scatterlist *sg,
>>> +                              unsigned int len, int opt)
>>> +{
>>> +    return -ENODEV;
>>
>> I know other IOMMU API dummies already use this error code, but I find
>> it to be a little confusing. The dummies are used when the IOMMU API is
>> disabled via Kconfig, so -ENOSYS (Function not implemented) seems like a
>> more useful error.
>
> Again, I would prefer to keep this consistent with the other APIs
> already there. iommu_map/iommu_unmap both return -ENODEV. If we want to
> change this I think it should be done as a separate patch that changes
> all of them to be consistent.
>
> Thanks,
>
> Olav
>
> --
> The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation
Olav Haugan July 23, 2014, 5:49 p.m. UTC | #5
On 7/22/2014 12:45 AM, Thierry Reding wrote:
> On Mon, Jul 21, 2014 at 05:59:22PM -0700, Olav Haugan wrote:
>> On 7/17/2014 1:21 AM, Thierry Reding wrote:
>>> On Wed, Jul 16, 2014 at 06:01:57PM -0700, Olav Haugan wrote:
> [...]
>>>> Additionally, the mapping operation would be faster in general since
>>>> clients does not have to keep calling map API over and over again for
>>>> each physically contiguous chunk of memory that needs to be mapped to a
>>>> virtually contiguous region.
>>>>
>>>> Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
>>>> ---
>>>>  drivers/iommu/iommu.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>  include/linux/iommu.h | 25 +++++++++++++++++++++++++
>>>>  2 files changed, 73 insertions(+)
>>>>
>>>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>>>> index 1698360..a0eebb7 100644
>>>> --- a/drivers/iommu/iommu.c
>>>> +++ b/drivers/iommu/iommu.c
>>>> @@ -1089,6 +1089,54 @@ size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
>>>>  EXPORT_SYMBOL_GPL(iommu_unmap);
>>>>  
>>>>  
>>>> +int iommu_map_range(struct iommu_domain *domain, unsigned int iova,
>>>
>>> Maybe iova should be dma_addr_t? Or at least unsigned long? And perhaps
>>> iommu_map_sg() would be more consistent with the equivalent function in
>>> struct dma_ops?
>>>
>>>> +		    struct scatterlist *sg, unsigned int len, int opt)
>>>
>>> The length argument seems to be the size of the mapping. Again, the
>>> struct dma_ops function uses this argument to denote the number of
>>> entries in the scatterlist.
>>>
>>> opt is somewhat opaque. Perhaps this should be turned into unsigned long
>>> flags? Although given that there aren't any users yet it's difficult to
>>> say what's best here. Perhaps the addition of this argument should be
>>> postponed until there are actual users?
>>
>> I am thinking something like this:
>>
>> int iommu_map_sg(struct iommu_domain *domain, struct scatterlist *sg,
>> unsigned int nents, int prot, unsigned long flags);
>> int iommu_unmap_sg(struct iommu_domain *domain, struct scatterlist *sg,
>> unsigned int nents, unsigned long flags);
> 
> Looks good.
> 
>> The iova is contained within sg so we don't need that argument really
> 
> I'm not sure. I think a common use-case for this function is for some
> driver to map an imported DMA-BUF. In that case you'll get a struct
> sg_table, but I think it won't have sg.dma_address set to anything
> useful. So if we don't have iova as a parameter to this function, the
> driver would have to make it a two-step process, like this:
> 
> 	sg_dma_address(sg) = iova;
> 
> 	err = iommu_map_sg(...);
> 
> And drivers that use the IOMMU API directly need to manage IOVA space
> themselves anyway, so I think passing around the IOVA within the SG
> won't be a very common case. It will almost always be the driver that
> calls this function which allocates the IOVA range.

Yes, I see your point. Rob is not a fan either...
So what about this:

int iommu_map_sg(struct iommu_domain *domain, unsigned long iova, struct
scatterlist *sg, unsigned int nents, int prot, unsigned long flags);
int iommu_unmap_sg(struct iommu_domain *domain, unsigned long iova,
size_t size, unsigned long flags);

No need for sg in the unmap call then. Keeping iova an unsigned long to
match the existing iommu_map/iommu_unmap calls.
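
For illustration, a rough sketch of a call site using these proposed
signatures; my_iova_alloc(), sgt and size below are placeholders for the
caller's own IOVA allocator and buffer bookkeeping, not part of the proposal:

	/*
	 * Hypothetical caller, e.g. a GPU driver mapping an imported
	 * dma-buf into its own IOVA space.
	 */
	unsigned long iova = my_iova_alloc(domain, size);
	int ret;

	ret = iommu_map_sg(domain, iova, sgt->sgl, sgt->nents,
			   IOMMU_READ | IOMMU_WRITE, 0);
	if (ret)
		return ret;

	/* ... device uses the buffer at 'iova' ... */

	iommu_unmap_sg(domain, iova, size, 0);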


>> and I would like to keep the flags argument. I would prefer not to
>> change the API after it has been published which could potentially
>> affect a lot of call sites.
> 
> We have pretty good tools to help with this kind of mechanical
> conversion, so I don't think changing the API will be much of a problem.
> However it seems likely that we'll want to specify flags eventually, so
> I don't have any strong objections to keeping that parameter.
> 
>>>> +			phys_addr_t phys = page_to_phys(sg_page(sg));
>>>> +			u32 page_len = PAGE_ALIGN(sg->offset + sg->length);
>>>
>>> Shouldn't this alignment be left to iommu_map() to handle? It has code
>>> to deal with that already.
>>
>> I don't see page alignment in the iommu_map function. I only see a check
>> whether the (iova | paddr | size) is aligned to the minimum page size
>> and then it errors out if it isn't....
> 
> Indeed, the above doesn't do what I thought it did.
> 
>>> Perhaps rather than check for a ->map_range implementation every time a
>>> better option may be to export this generic implementation so that
>>> drivers can set it in their iommu_ops if they don't implement it? So the
>>> contents of the if () block could become a new function:
>>>
>>> 	int iommu_map_range_generic(...)
>>> 	{
>>> 		...
>>> 	}
>>> 	EXPORT_SYMBOL(iommu_map_range_generic);
>>>
>>> And drivers would do this:
>>>
>>> 	static const struct iommu_ops driver_iommu_ops = {
>>> 		...
>>> 		.map_range = iommu_map_range_generic,
>>> 		...
>>> 	};
>>
>> I'd like to keep the new API consistent with the rest of the API. Most
>> if not all of the other APIs always check if the operation is non-NULL.
> 
> But that's because the other operations are either optional or they
> don't provide a fallback implementation. For .map_sg() the issue is
> different because the core provides a slow fallback. The disadvantage
> of keeping the check within iommu_map_sg() is that you have this check
> every time the function is called. If on the other hand drivers
> specifically set a pointer (either a custom implementation or the
> wrapper around iommu_map()) you'll simply call the driver function no
> matter what and don't have to special case.
> 
> It also makes it possible for drivers to opt-in to using the generic
> fallback. Currently if driver writers don't set it explicitly they'll
> silently get a fallback implementation and they may not even notice.
>
>> A new driver could choose not to set the .map_range callback. I think it
>> is better to keep this consistent with the behavior of the other APIs.
> 
> I'd argue that it's more consistent to not provide the fallback by
> default. None of the other functions do that. If the driver doesn't
> implement a callback then the iommu_*() functions will return an error.
> 
> For .map_sg() I think pretty much every driver will want to implement
> it, so requiring developers to explicitly set it for their driver seems
> like a good idea. If there's no advantage in implementing it then they
> can always get the same functionality by explicitly using the fallback.

I feel that requiring drivers to set the default callback defeats the
purpose of having a fallback in the first place. The reason to provide
the default fallback is to catch drivers that do not implement this
themselves.

Joerg, can you comment on what you envisioned when you suggested that we
add the fallback?

>>>> +int iommu_unmap_range(struct iommu_domain *domain, unsigned int iova,
>>>> +		      unsigned int len, int opt)
>>>
>>> Some comments regarding function name and argument types as for
>>> iommu_map_range().
>>>
>>>> +static inline int iommu_map_range(struct iommu_domain *domain,
>>>> +				  unsigned int iova, struct scatterlist *sg,
>>>> +				  unsigned int len, int opt)
>>>> +{
>>>> +	return -ENODEV;
>>>
>>> I know other IOMMU API dummies already use this error code, but I find
>>> it to be a little confusing. The dummies are used when the IOMMU API is
>>> disabled via Kconfig, so -ENOSYS (Function not implemented) seems like a
>>> more useful error.
>>
>> Again, I would prefer to keep this consistent with the other APIs
>> already there. iommu_map/iommu_unmap both return -ENODEV. If we want to
>> change this I think it should be done as a separate patch that changes
>> all of them to be consistent.
> 
> Fair enough.
> 
> Thierry
> 

Olav
Joerg Roedel July 24, 2014, 9:34 a.m. UTC | #6
On Wed, Jul 23, 2014 at 10:49:55AM -0700, Olav Haugan wrote:
> Joerg, can you comment on what you envisioned when you suggested that we
> add the fallback?
> 

The problem is that we already have tons of IOMMU drivers in the tree
which don't provide these call-backs. So adding this API extension
without a fall-back that works for these drivers too would fragment the
functionality between different IOMMU drivers in an unacceptable way and
undermine the purpose of a generic API.


	Joerg

Thierry Reding July 24, 2014, 10:07 a.m. UTC | #7
On Thu, Jul 24, 2014 at 11:34:27AM +0200, Joerg Roedel wrote:
> On Wed, Jul 23, 2014 at 10:49:55AM -0700, Olav Haugan wrote:
> > Joerg, can you comment on what you envisioned when you suggested that we
> > add the fallback?
> > 
> 
> The problem is that we already have tons of IOMMU drivers in the tree
> which don't provide these call-backs. So adding this API extension
> without a fall-back that works for these drivers too would fragment the
> functionality between different IOMMU drivers in an unacceptable way and
> undermine the purpose of a generic API.

But we only care about in-tree drivers anyway, so we can equally well
just point all drivers to the generic implementation in the same patch
that adds this new function. The end result will be the same, but it
will keep the core function simpler (and more consistent with the other
core functions).

Thierry
Thierry Reding July 24, 2014, 10:14 a.m. UTC | #8
On Wed, Jul 23, 2014 at 10:49:55AM -0700, Olav Haugan wrote:
> On 7/22/2014 12:45 AM, Thierry Reding wrote:
> > On Mon, Jul 21, 2014 at 05:59:22PM -0700, Olav Haugan wrote:
> >> On 7/17/2014 1:21 AM, Thierry Reding wrote:
> >>> On Wed, Jul 16, 2014 at 06:01:57PM -0700, Olav Haugan wrote:
> > [...]
> >>>> Additionally, the mapping operation would be faster in general since
> >>>> clients does not have to keep calling map API over and over again for
> >>>> each physically contiguous chunk of memory that needs to be mapped to a
> >>>> virtually contiguous region.
> >>>>
> >>>> Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
> >>>> ---
> >>>>  drivers/iommu/iommu.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>  include/linux/iommu.h | 25 +++++++++++++++++++++++++
> >>>>  2 files changed, 73 insertions(+)
> >>>>
> >>>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> >>>> index 1698360..a0eebb7 100644
> >>>> --- a/drivers/iommu/iommu.c
> >>>> +++ b/drivers/iommu/iommu.c
> >>>> @@ -1089,6 +1089,54 @@ size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
> >>>>  EXPORT_SYMBOL_GPL(iommu_unmap);
> >>>>  
> >>>>  
> >>>> +int iommu_map_range(struct iommu_domain *domain, unsigned int iova,
> >>>
> >>> Maybe iova should be dma_addr_t? Or at least unsigned long? And perhaps
> >>> iommu_map_sg() would be more consistent with the equivalent function in
> >>> struct dma_ops?
> >>>
> >>>> +		    struct scatterlist *sg, unsigned int len, int opt)
> >>>
> >>> The length argument seems to be the size of the mapping. Again, the
> >>> struct dma_ops function uses this argument to denote the number of
> >>> entries in the scatterlist.
> >>>
> >>> opt is somewhat opaque. Perhaps this should be turned into unsigned long
> >>> flags? Although given that there aren't any users yet it's difficult to
> >>> say what's best here. Perhaps the addition of this argument should be
> >>> postponed until there are actual users?
> >>
> >> I am thinking something like this:
> >>
> >> int iommu_map_sg(struct iommu_domain *domain, struct scatterlist *sg,
> >> unsigned int nents, int prot, unsigned long flags);
> >> int iommu_unmap_sg(struct iommu_domain *domain, struct scatterlist *sg,
> >> unsigned int nents, unsigned long flags);
> > 
> > Looks good.
> > 
> >> The iova is contained within sg so we don't need that argument really
> > 
> > I'm not sure. I think a common use-case for this function is for some
> > driver to map an imported DMA-BUF. In that case you'll get a struct
> > sg_table, but I think it won't have sg.dma_address set to anything
> > useful. So if we don't have iova as a parameter to this function, the
> > driver would have to make it a two-step process, like this:
> > 
> > 	sg_dma_address(sg) = iova;
> > 
> > 	err = iommu_map_sg(...);
> > 
> > And drivers that use the IOMMU API directly need to manage IOVA space
> > themselves anyway, so I think passing around the IOVA within the SG
> > won't be a very common case. It will almost always be the driver that
> > calls this function which allocates the IOVA range.
> 
> Yes, I see your point. Rob is not a fan either...
> So what about this:
> 
> int iommu_map_sg(struct iommu_domain *domain, unsigned long iova, struct
> scatterlist *sg, unsigned int nents, int prot, unsigned long flags);
> int iommu_unmap_sg(struct iommu_domain *domain, unsigned long iova,
> size_t size, unsigned long flags);
> 
> No need for sg in the unmap call then. Keeping iova an unsigned long to
> match the existing iommu_map/iommu_unmap calls.

Looks good to me. I think we may want to eventually convert iova to
dma_addr_t since that's a more appropriate type for these addresses but
we can do that in a separate patch later on.

[...]
> > For .map_sg() I think pretty much every driver will want to implement
> > it, so requiring developers to explicitly set it for their driver seems
> > like a good idea. If there's no advantage in implementing it then they
> > can always get the same functionality by explicitly using the fallback.
> 
> I feel that requiring drivers to set the default callback defeats the
> purpose of having a fallback in the first place. The reason to provide
> the default fallback is to catch any driver that does not implement this
> themselves.

Certainly, but the exact same can be achieved by making all drivers
point to the generic implementation. In my opinion that makes it much
more explicit (and therefore obvious) what's really going on. Hiding
fallbacks in the core API obfuscates.

And this is all in-tree code, so we have full control over what we
change, so the patch that introduces this new API could simply make the
necessary adjustments to existing drivers.
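
For illustration, with the iommu_map_sg() naming that adjustment would amount
to roughly the following per driver, where the default_* helper names are
just placeholders for the exported generic implementation:

	static const struct iommu_ops some_driver_iommu_ops = {
		...
		.map_sg		= default_iommu_map_sg,
		.unmap_sg	= default_iommu_unmap_sg,
		...
	};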

Thierry

Patch

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 1698360..a0eebb7 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1089,6 +1089,54 @@  size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
 EXPORT_SYMBOL_GPL(iommu_unmap);
 
 
+int iommu_map_range(struct iommu_domain *domain, unsigned int iova,
+		    struct scatterlist *sg, unsigned int len, int opt)
+{
+	s32 ret = 0;
+	u32 offset = 0;
+	u32 start_iova = iova;
+
+	BUG_ON(iova & (~PAGE_MASK));
+
+	if (unlikely(domain->ops->map_range == NULL)) {
+		while (offset < len) {
+			phys_addr_t phys = page_to_phys(sg_page(sg));
+			u32 page_len = PAGE_ALIGN(sg->offset + sg->length);
+
+			ret = iommu_map(domain, iova, phys, page_len, opt);
+			if (ret)
+				goto fail;
+
+			iova += page_len;
+			offset += page_len;
+			if (offset < len)
+				sg = sg_next(sg);
+		}
+	} else {
+		ret = domain->ops->map_range(domain, iova, sg, len, opt);
+	}
+	goto out;
+
+fail:
+	/* undo mappings already done in case of error */
+	iommu_unmap(domain, start_iova, offset);
+out:
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_map_range);
+
+int iommu_unmap_range(struct iommu_domain *domain, unsigned int iova,
+		      unsigned int len, int opt)
+{
+	BUG_ON(iova & (~PAGE_MASK));
+
+	if (unlikely(domain->ops->unmap_range == NULL))
+		return iommu_unmap(domain, iova, len);
+	else
+		return domain->ops->unmap_range(domain, iova, len, opt);
+}
+EXPORT_SYMBOL_GPL(iommu_unmap_range);
+
 int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr,
 			       phys_addr_t paddr, u64 size, int prot)
 {
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c7097d7..54c836e 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -22,6 +22,7 @@ 
 #include <linux/errno.h>
 #include <linux/err.h>
 #include <linux/types.h>
+#include <linux/scatterlist.h>
 #include <trace/events/iommu.h>
 
 #define IOMMU_READ	(1 << 0)
@@ -93,6 +94,8 @@  enum iommu_attr {
  * @detach_dev: detach device from an iommu domain
  * @map: map a physically contiguous memory region to an iommu domain
  * @unmap: unmap a physically contiguous memory region from an iommu domain
+ * @map_range: map a scatter-gather list of physically contiguous memory chunks to an iommu domain
+ * @unmap_range: unmap a scatter-gather list of physically contiguous memory chunks from an iommu domain
  * @iova_to_phys: translate iova to physical address
  * @domain_has_cap: domain capabilities query
  * @add_device: add device to iommu grouping
@@ -110,6 +113,10 @@  struct iommu_ops {
 		   phys_addr_t paddr, size_t size, int prot);
 	size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
 		     size_t size);
+	int (*map_range)(struct iommu_domain *domain, unsigned int iova,
+		    struct scatterlist *sg, unsigned int len, int opt);
+	int (*unmap_range)(struct iommu_domain *domain, unsigned int iova,
+		      unsigned int len, int opt);
 	phys_addr_t (*iova_to_phys)(struct iommu_domain *domain, dma_addr_t iova);
 	int (*domain_has_cap)(struct iommu_domain *domain,
 			      unsigned long cap);
@@ -153,6 +160,10 @@  extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
 		     phys_addr_t paddr, size_t size, int prot);
 extern size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova,
 		       size_t size);
+extern int iommu_map_range(struct iommu_domain *domain, unsigned int iova,
+		    struct scatterlist *sg, unsigned int len, int opt);
+extern int iommu_unmap_range(struct iommu_domain *domain, unsigned int iova,
+		      unsigned int len, int opt);
 extern phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova);
 extern int iommu_domain_has_cap(struct iommu_domain *domain,
 				unsigned long cap);
@@ -287,6 +298,20 @@  static inline int iommu_unmap(struct iommu_domain *domain, unsigned long iova,
 	return -ENODEV;
 }
 
+static inline int iommu_map_range(struct iommu_domain *domain,
+				  unsigned int iova, struct scatterlist *sg,
+				  unsigned int len, int opt)
+{
+	return -ENODEV;
+}
+
+static inline int iommu_unmap_range(struct iommu_domain *domain,
+				    unsigned int iova,
+				    unsigned int len, int opt)
+{
+	return -ENODEV;
+}
+
 static inline int iommu_domain_window_enable(struct iommu_domain *domain,
 					     u32 wnd_nr, phys_addr_t paddr,
 					     u64 size, int prot)