
[v3,1/1] iommu-api: Add map_sg/unmap_sg functions

Message ID 1406572731-6216-2-git-send-email-ohaugan@codeaurora.org (mailing list archive)
State New, archived

Commit Message

Olav Haugan July 28, 2014, 6:38 p.m. UTC
Mapping and unmapping are more often than not in the critical path.
map_sg and unmap_sg allow IOMMU driver implementations to optimize
the process of mapping and unmapping buffers into the IOMMU page tables.

Instead of mapping a buffer one page at a time and requiring potentially
expensive TLB operations for each page, these functions allow the driver
to map all pages in one go and defer TLB maintenance until after all
pages have been mapped.

Additionally, the mapping operation would generally be faster since
clients do not have to call the map API repeatedly for each physically
contiguous chunk of memory that needs to be mapped to a virtually
contiguous region.

Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
---
 drivers/iommu/iommu.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/iommu.h | 28 ++++++++++++++++++++++++++++
 2 files changed, 76 insertions(+)
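
For orientation, a short caller-side sketch (illustrative only and not part of the patch; the function and buffer names are invented, and it assumes the v3 signature above, where iommu_map_sg() returns 0 on success and takes a flags argument, passed as 0 here):

/* Illustrative only: map a whole scatter-gather table with one call
 * instead of looping over iommu_map() per contiguous chunk. */
static int my_dev_map_buffer(struct iommu_domain *domain, unsigned long iova,
			     struct sg_table *sgt, int prot)
{
	int ret;

	/* The IOMMU driver backend may batch page-table updates and
	 * defer TLB maintenance until the whole list is mapped. */
	ret = iommu_map_sg(domain, iova, sgt->sgl, sgt->nents, prot, 0);
	if (ret)
		return ret;

	/* ... device works on the mapping ... */

	return 0;
}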

Comments

Will Deacon July 28, 2014, 7:11 p.m. UTC | #1
Hi Olav,

On Mon, Jul 28, 2014 at 07:38:51PM +0100, Olav Haugan wrote:
> Mapping and unmapping are more often than not in the critical path.
> map_sg and unmap_sg allow IOMMU driver implementations to optimize
> the process of mapping and unmapping buffers into the IOMMU page tables.
> 
> Instead of mapping a buffer one page at a time and requiring potentially
> expensive TLB operations for each page, these functions allow the driver
> to map all pages in one go and defer TLB maintenance until after all
> pages have been mapped.
> 
> Additionally, the mapping operation would generally be faster since
> clients do not have to call the map API repeatedly for each physically
> contiguous chunk of memory that needs to be mapped to a virtually
> contiguous region.
> 
> Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
> ---
>  drivers/iommu/iommu.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/iommu.h | 28 ++++++++++++++++++++++++++++
>  2 files changed, 76 insertions(+)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 1698360..cd65511 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1088,6 +1088,54 @@ size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
>  }
>  EXPORT_SYMBOL_GPL(iommu_unmap);
>  
> +int iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
> +			struct scatterlist *sg, unsigned int nents,
> +			int prot, unsigned long flags)
> +{
> +	int ret = 0;
> +	unsigned long offset = 0;
> +
> +	BUG_ON(iova & (~PAGE_MASK));
> +
> +	if (unlikely(domain->ops->map_sg == NULL)) {
> +		unsigned int i;
> +		struct scatterlist *s;
> +
> +		for_each_sg(sg, s, nents, i) {
> +			phys_addr_t phys = page_to_phys(sg_page(s));
> +			u32 page_len = PAGE_ALIGN(s->offset + s->length);

Hmm, this is a pretty horrible place where CPU page size (from the sg list)
meets the IOMMU and I think we need to do something better to avoid spurious
failures. In other words, the sg list should be iterated in such a way that
we always pass a multiple of a supported iommu page size to iommu_map.

The PAGE_MASK and PAGE_ALIGN used here are based on the CPU page size,
which needn't match what is supported by the IOMMU hardware.
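
To illustrate the constraint (a rough, untested sketch rather than anything from the thread; the helper name is made up), one could derive the IOMMU's minimum page size from domain->ops->pgsize_bitmap and validate each sg entry against that, instead of assuming the CPU's PAGE_SIZE:

/* Rough, untested sketch: reject sg entries whose physical address or
 * length is not a multiple of the smallest page size the IOMMU
 * supports (lowest bit set in pgsize_bitmap). */
static int check_sg_iommu_alignment(struct iommu_domain *domain,
				    struct scatterlist *sg,
				    unsigned int nents)
{
	unsigned long min_pagesz = 1UL << __ffs(domain->ops->pgsize_bitmap);
	struct scatterlist *s;
	unsigned int i;

	for_each_sg(sg, s, nents, i) {
		phys_addr_t phys = page_to_phys(sg_page(s)) + s->offset;

		if (!IS_ALIGNED(phys | s->length, min_pagesz))
			return -EINVAL;	/* cannot map this entry exactly */
	}

	return 0;
}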

Will
Olav Haugan July 29, 2014, 12:50 a.m. UTC | #2
Hi Will,

On 7/28/2014 12:11 PM, Will Deacon wrote:
> Hi Olav,
> 
> On Mon, Jul 28, 2014 at 07:38:51PM +0100, Olav Haugan wrote:
>> Mapping and unmapping are more often than not in the critical path.
>> map_sg and unmap_sg allow IOMMU driver implementations to optimize
>> the process of mapping and unmapping buffers into the IOMMU page tables.
>>
>> Instead of mapping a buffer one page at a time and requiring potentially
>> expensive TLB operations for each page, these functions allow the driver
>> to map all pages in one go and defer TLB maintenance until after all
>> pages have been mapped.
>>
>> Additionally, the mapping operation would generally be faster since
>> clients do not have to call the map API repeatedly for each physically
>> contiguous chunk of memory that needs to be mapped to a virtually
>> contiguous region.
>>
>> Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
>> ---
>>  drivers/iommu/iommu.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  include/linux/iommu.h | 28 ++++++++++++++++++++++++++++
>>  2 files changed, 76 insertions(+)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 1698360..cd65511 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -1088,6 +1088,54 @@ size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
>>  }
>>  EXPORT_SYMBOL_GPL(iommu_unmap);
>>  
>> +int iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
>> +			struct scatterlist *sg, unsigned int nents,
>> +			int prot, unsigned long flags)
>> +{
>> +	int ret = 0;
>> +	unsigned long offset = 0;
>> +
>> +	BUG_ON(iova & (~PAGE_MASK));
>> +
>> +	if (unlikely(domain->ops->map_sg == NULL)) {
>> +		unsigned int i;
>> +		struct scatterlist *s;
>> +
>> +		for_each_sg(sg, s, nents, i) {
>> +			phys_addr_t phys = page_to_phys(sg_page(s));
>> +			u32 page_len = PAGE_ALIGN(s->offset + s->length);
> 
> Hmm, this is a pretty horrible place where CPU page size (from the sg list)
> meets the IOMMU and I think we need to do something better to avoid spurious
> failures. In other words, the sg list should be iterated in such a way that
> we always pass a multiple of a supported iommu page size to iommu_map.
> 
> The PAGE_MASK and PAGE_ALIGN used here are based on the CPU page size,
> which needn't match what is supported by the IOMMU hardware.

I am not sure what you mean. How can we iterate over the sg list in a
different way to ensure we pass a multiple of a supported iommu page
size? Each entry in the sg list is physically discontiguous from the
others. If the page is too big, iommu_map will take care of it for us:
it already finds the biggest supported page size and splits up the calls
to domain->ops->map(). Also, whoever allocates memory for use by the
IOMMU needs to be aware of the supported minimum size, or else they
would get mapping failures anyway.

(The code in __map_sg_chunk in arch/arm/mm/dma-mapping.c does the same
thing btw.)
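
For readers unfamiliar with the splitting being referred to, a simplified paraphrase (not the verbatim kernel source) of the page-size selection that iommu_map() performs via iommu_pgsize():

/* Simplified paraphrase of iommu_pgsize(): pick the largest page size
 * supported by the hardware that fits both the remaining length and
 * the combined alignment of iova and paddr. */
static size_t pick_pgsize(unsigned long pgsize_bitmap,
			  unsigned long addr_merge, size_t size)
{
	/* Largest power of two that still fits in 'size'. */
	unsigned int pgsize_idx = __fls(size);
	unsigned long pgsize;

	/* The address alignment may force something smaller. */
	if (addr_merge) {
		unsigned int align_pgsize_idx = __ffs(addr_merge);

		pgsize_idx = min(pgsize_idx, align_pgsize_idx);
	}

	/* Mask of candidate sizes, minus anything unsupported... */
	pgsize = (1UL << (pgsize_idx + 1)) - 1;
	pgsize &= pgsize_bitmap;

	/* ...then take the biggest one left (the real code BUG()s if
	 * nothing is left). */
	return pgsize ? 1UL << __fls(pgsize) : 0;
}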

Thanks,

Olav
Will Deacon July 29, 2014, 9:25 a.m. UTC | #3
Hi Olav,

On Tue, Jul 29, 2014 at 01:50:08AM +0100, Olav Haugan wrote:
> On 7/28/2014 12:11 PM, Will Deacon wrote:
> > On Mon, Jul 28, 2014 at 07:38:51PM +0100, Olav Haugan wrote:
> >> +int iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
> >> +			struct scatterlist *sg, unsigned int nents,
> >> +			int prot, unsigned long flags)
> >> +{
> >> +	int ret = 0;
> >> +	unsigned long offset = 0;
> >> +
> >> +	BUG_ON(iova & (~PAGE_MASK));
> >> +
> >> +	if (unlikely(domain->ops->map_sg == NULL)) {
> >> +		unsigned int i;
> >> +		struct scatterlist *s;
> >> +
> >> +		for_each_sg(sg, s, nents, i) {
> >> +			phys_addr_t phys = page_to_phys(sg_page(s));
> >> +			u32 page_len = PAGE_ALIGN(s->offset + s->length);
> > 
> > Hmm, this is a pretty horrible place where CPU page size (from the sg list)
> > meets the IOMMU and I think we need to do something better to avoid spurious
> > failures. In other words, the sg list should be iterated in such a way that
> > we always pass a multiple of a supported iommu page size to iommu_map.
> > 
> > The PAGE_MASK and PAGE_ALIGN used here are based on the CPU page size,
> > which needn't match what is supported by the IOMMU hardware.
> 
> I am not sure what you mean. How can we iterate over the sg list in a
> different way to ensure we pass a multiple of a supported iommu page
> size? Each entry in the sg list is physically discontiguous from the
> others. If the page is too big, iommu_map will take care of it for us:
> it already finds the biggest supported page size and splits up the calls
> to domain->ops->map(). Also, whoever allocates memory for use by the
> IOMMU needs to be aware of the supported minimum size, or else they
> would get mapping failures anyway.

I agree that we can't handle IOMMUs that have a minimum page size larger
than the CPU page size, but we should be able to handle the case where the
maximum supported page size on the IOMMU is smaller than the CPU page size
(e.g. 4k IOMMU with 64k pages on the CPU). I think that could trip a BUG_ON
with your patch, although the alignment would be ok in iommu_map because
page sizes are always a power-of-2. You also end up rounding the size to
64k, which could lead to mapping more than you really need to.
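
To make the overmapping concrete, a hypothetical set of numbers (not taken from the thread) for the 64k-CPU-page / 4k-only-IOMMU case:

/* Hypothetical example: PAGE_SIZE = 64 KiB (0x10000), IOMMU supports
 * only 4 KiB pages.
 *
 *   s->offset = 0, s->length = 0x3000      (12 KiB of real data)
 *   page_len  = PAGE_ALIGN(0 + 0x3000) = 0x10000   (64 KiB)
 *
 * iommu_map() then creates sixteen 4 KiB IOMMU mappings, thirteen of
 * which expose memory the caller never asked to map. */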

> (The code in __map_sg_chunk in arch/arm/mm/dma-mapping.c does the same
> thing btw.)

I have the same objection to that code :)

Will
Olav Haugan July 29, 2014, 5:21 p.m. UTC | #4
On 7/29/2014 2:25 AM, Will Deacon wrote:
> Hi Olav,
> 
> On Tue, Jul 29, 2014 at 01:50:08AM +0100, Olav Haugan wrote:
>> On 7/28/2014 12:11 PM, Will Deacon wrote:
>>> On Mon, Jul 28, 2014 at 07:38:51PM +0100, Olav Haugan wrote:
>>>> +int iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
>>>> +			struct scatterlist *sg, unsigned int nents,
>>>> +			int prot, unsigned long flags)
>>>> +{
>>>> +	int ret = 0;
>>>> +	unsigned long offset = 0;
>>>> +
>>>> +	BUG_ON(iova & (~PAGE_MASK));
>>>> +
>>>> +	if (unlikely(domain->ops->map_sg == NULL)) {
>>>> +		unsigned int i;
>>>> +		struct scatterlist *s;
>>>> +
>>>> +		for_each_sg(sg, s, nents, i) {
>>>> +			phys_addr_t phys = page_to_phys(sg_page(s));
>>>> +			u32 page_len = PAGE_ALIGN(s->offset + s->length);
>>>
>>> Hmm, this is a pretty horrible place where CPU page size (from the sg list)
>>> meets the IOMMU and I think we need to do something better to avoid spurious
>>> failures. In other words, the sg list should be iterated in such a way that
>>> we always pass a multiple of a supported iommu page size to iommu_map.
>>>
>>> The PAGE_MASK and PAGE_ALIGN used here are based on the CPU page size,
>>> which needn't match what is supported by the IOMMU hardware.
>>
>> I am not sure what you mean. How can we iterate over the sg list in a
>> different way to ensure we pass a multiple of a supported iommu page
>> size? Each entry in the sg list is physically discontiguous from the
>> others. If the page is too big, iommu_map will take care of it for us:
>> it already finds the biggest supported page size and splits up the calls
>> to domain->ops->map(). Also, whoever allocates memory for use by the
>> IOMMU needs to be aware of the supported minimum size, or else they
>> would get mapping failures anyway.
> 
> I agree that we can't handle IOMMUs that have a minimum page size larger
> than the CPU page size, but we should be able to handle the case where the
> maximum supported page size on the IOMMU is smaller than the CPU page size
> (e.g. 4k IOMMU with 64k pages on the CPU). I think that could trip a BUG_ON
> with your patch, although the alignment would be ok in iommu_map because
> page sizes are always a power-of-2. You also end up rounding the size to
> 64k, which could lead to mapping more than you really need to.

Which BUG_ON would I trip? If the supported IOMMU page size is less than
the CPU page size then iommu_map will nicely take care of splitting up
the mapping calls into sizes supported by the IOMMU (handled by
iommu_pgsize()). However, I see your point regarding the PAGE_ALIGN of
offset+length, which can cause overmapping that you don't really want.
What is the alternative here? Just leave it and do not align at all?
That is how iommu_map() currently works: it will return an error if the
iova|phys|size is not aligned to the minimum pgsize supported by the
IOMMU. So I would not change the behavior if I just left it without
trying to align.

I will remove the BUG_ON for (iova & (~PAGE_MASK)).
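
Purely as an illustration of that change (this is not an actual later revision of the patch), the fallback loop without the PAGE_ALIGN rounding might look roughly like this, leaving it to iommu_map() to reject anything the hardware cannot represent:

	/* Illustrative sketch only: pass the sg entry's own offset and
	 * length straight through instead of rounding to PAGE_SIZE. */
	for_each_sg(sg, s, nents, i) {
		phys_addr_t phys = page_to_phys(sg_page(s)) + s->offset;
		size_t len = s->length;

		/* iommu_map() returns -EINVAL itself if iova, phys or
		 * len is not aligned to the IOMMU's minimum page size. */
		ret = iommu_map(domain, iova + offset, phys, len, prot);
		if (ret)
			goto fail;

		offset += len;
	}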

>> (The code in __map_sg_chunk in arch/arm/mm/dma-mapping.c does the same
>> thing btw.)
> 
> I have the same objection to that code :)

I am hoping we can remove/simplify some of that code when we have the
iommu_map_sg API available...

Thanks,

Olav
Will Deacon July 30, 2014, 9:45 a.m. UTC | #5
On Tue, Jul 29, 2014 at 06:21:48PM +0100, Olav Haugan wrote:
> On 7/29/2014 2:25 AM, Will Deacon wrote:
> > I agree that we can't handle IOMMUs that have a minimum page size larger
> > than the CPU page size, but we should be able to handle the case where the
> > maximum supported page size on the IOMMU is smaller than the CPU page size
> > (e.g. 4k IOMMU with 64k pages on the CPU). I think that could trip a BUG_ON
> > with your patch, although the alignment would be ok in iommu_map because
> > page sizes are always a power-of-2. You also end up rounding the size to
> > 64k, which could lead to mapping more than you really need to.
> 
> Which BUG_ON would I trip? If the supported IOMMU page size is less than
> the CPU page size then iommu_map will nicely take care of splitting up
> the mapping calls into sizes supported by the IOMMU (handled by
> iommu_pgsize()). However, I see your point regarding the PAGE_ALIGN of
> offset+length, which can cause overmapping that you don't really want.
> What is the alternative here? Just leave it and do not align at all?
> That is how iommu_map() currently works: it will return an error if the
> iova|phys|size is not aligned to the minimum pgsize supported by the
> IOMMU. So I would not change the behavior if I just left it without
> trying to align.

Yeah, I think losing the align is probably the best bet for now.

> I will remove the BUG_ON for (iova & (~PAGE_MASK)).

Great, that's the BUG_ON I was referring to above.

> >> (The code in __map_sg_chunk in arch/arm/mm/dma-mapping.c does the same
> >> thing btw.)
> > 
> > I have the same objection to that code :)
> 
> I am hoping we can remove/simplify some of that code when we have the
> iommu_map_sg API available...

Looking forward to it!

Will

Patch

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 1698360..cd65511 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1088,6 +1088,54 @@  size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
 }
 EXPORT_SYMBOL_GPL(iommu_unmap);
 
+int iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
+			struct scatterlist *sg, unsigned int nents,
+			int prot, unsigned long flags)
+{
+	int ret = 0;
+	unsigned long offset = 0;
+
+	BUG_ON(iova & (~PAGE_MASK));
+
+	if (unlikely(domain->ops->map_sg == NULL)) {
+		unsigned int i;
+		struct scatterlist *s;
+
+		for_each_sg(sg, s, nents, i) {
+			phys_addr_t phys = page_to_phys(sg_page(s));
+			u32 page_len = PAGE_ALIGN(s->offset + s->length);
+
+			ret = iommu_map(domain, iova + offset, phys, page_len,
+					prot);
+			if (ret)
+				goto fail;
+
+			offset += page_len;
+		}
+	} else {
+		ret = domain->ops->map_sg(domain, iova, sg, nents, prot, flags);
+	}
+	goto out;
+
+fail:
+	/* undo mappings already done in case of error */
+	iommu_unmap(domain, iova, offset);
+out:
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_map_sg);
+
+int iommu_unmap_sg(struct iommu_domain *domain, unsigned long iova,
+			size_t size, unsigned long flags)
+{
+	BUG_ON(iova & (~PAGE_MASK));
+
+	if (unlikely(domain->ops->unmap_sg == NULL))
+		return iommu_unmap(domain, iova, size);
+	else
+		return domain->ops->unmap_sg(domain, iova, size, flags);
+}
+EXPORT_SYMBOL_GPL(iommu_unmap_sg);
 
 int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr,
 			       phys_addr_t paddr, u64 size, int prot)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 20f9a52..66ad543 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -22,6 +22,7 @@ 
 #include <linux/errno.h>
 #include <linux/err.h>
 #include <linux/types.h>
+#include <linux/scatterlist.h>
 #include <trace/events/iommu.h>
 
 #define IOMMU_READ	(1 << 0)
@@ -93,6 +94,10 @@  enum iommu_attr {
  * @detach_dev: detach device from an iommu domain
  * @map: map a physically contiguous memory region to an iommu domain
  * @unmap: unmap a physically contiguous memory region from an iommu domain
+ * @map_sg: map a scatter-gather list of physically contiguous memory chunks
+ * to an iommu domain
+ * @unmap_sg: unmap a scatter-gather list of physically contiguous memory
+ * chunks from an iommu domain
  * @iova_to_phys: translate iova to physical address
  * @domain_has_cap: domain capabilities query
  * @add_device: add device to iommu grouping
@@ -110,6 +115,11 @@  struct iommu_ops {
 		   phys_addr_t paddr, size_t size, int prot);
 	size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
 		     size_t size);
+	int (*map_sg)(struct iommu_domain *domain, unsigned long iova,
+			struct scatterlist *sg, unsigned int nents, int prot,
+			unsigned long flags);
+	int (*unmap_sg)(struct iommu_domain *domain, unsigned long iova,
+			size_t size, unsigned long flags);
 	phys_addr_t (*iova_to_phys)(struct iommu_domain *domain, dma_addr_t iova);
 	int (*domain_has_cap)(struct iommu_domain *domain,
 			      unsigned long cap);
@@ -153,6 +163,11 @@  extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
 		     phys_addr_t paddr, size_t size, int prot);
 extern size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova,
 		       size_t size);
+extern int iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
+			struct scatterlist *sg, unsigned int nents, int prot,
+			unsigned long flags);
+extern int iommu_unmap_sg(struct iommu_domain *domain, unsigned long iova,
+				size_t size, unsigned long flags);
 extern phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova);
 extern int iommu_domain_has_cap(struct iommu_domain *domain,
 				unsigned long cap);
@@ -287,6 +302,19 @@  static inline int iommu_unmap(struct iommu_domain *domain, unsigned long iova,
 	return -ENODEV;
 }
 
+static inline int iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
+			struct scatterlist *sg, unsigned int nents, int prot,
+			unsigned long flags)
+{
+	return -ENODEV;
+}
+
+static inline int iommu_unmap_sg(struct iommu_domain *domain,
+			unsigned long iova, size_t size, unsigned long flags)
+{
+	return -ENODEV;
+}
+
 static inline int iommu_domain_window_enable(struct iommu_domain *domain,
 					     u32 wnd_nr, phys_addr_t paddr,
 					     u64 size, int prot)