diff mbox

[5/6] IB core: Fix ib_sg_to_pages()

Message ID 565FA75E.7010100@sandisk.com (mailing list archive)
State Superseded
Headers show

Commit Message

Bart Van Assche Dec. 3, 2015, 2:22 a.m. UTC
On 12/02/2015 01:31 AM, Sagi Grimberg wrote:
> On 01/12/2015 21:10, Bart Van Assche wrote:
>> On 12/01/2015 10:32 AM, Sagi Grimberg wrote:
>> How ib_sg_to_pages() reports to its caller that mapping the first
>> scatterlist element failed is not important to me. I included that
>> change in this patch because I noticed that in the upstream SRP
>> initiator does not consider ib_map_mr_sg() returning zero as an error. I
>> think either ib_sg_to_pages() or ib_map_mr_sg() should be modified such
>> that "no pages have been mapped" is handled as a mapping failure.
>
> I don't either, but the patch introduces inconsistency in a case where
> the caller passed sg_nents=0 (which will return 0 and not -errno).
>
> Would it make better sense to have srp treat 0 return as error? note
> that all other ULPs treat return_val != sg_nents as error.

Hello Sagi,

Hmm ... why would it be unacceptable to return 0 if sg_nents == 0 ?

Regarding which component to modify if mapping the first page fails:
for almost every kernel function I know a negative return value means
failure and a return value >= 0 means success. Hence my proposal to
change the return value of the ib_map_mr_sg() function if mapping the
first page fails.

How about the patch below ?

Bart.



[PATCH] IB core: Fix ib_sg_to_pages()

Fix the code for detecting gaps. A gap occurs not only if the
second or later scatterlist element is not aligned but also if
any scatterlist element other than the last does not end at a
page boundary. Ensure that this function returns a negative
error code instead of zero if the first set_page() call fails.

Fixes: commit 4c67e2bfc8b7 ("IB/core: Introduce new fast registration API")
Reported-by: Christoph Hellwig <hch@lst.de>
---
 drivers/infiniband/core/verbs.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

Comments

Sagi Grimberg Dec. 3, 2015, 9:07 a.m. UTC | #1
> Hello Sagi,
>
> Hmm ... why would it be unacceptable to return 0 if sg_nents == 0 ?
>
> Regarding which component to modify if mapping the first page fails:
> for almost every kernel function I know a negative return value means
> failure and a return value >= 0 means success. Hence my proposal to
> change the return value of the ib_map_mr_sg() function if mapping the
> first page fails.

I'm fine with that.

>
> How about the patch below ?

Looks fine.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig Dec. 3, 2015, 9:18 a.m. UTC | #2
> How about the patch below ?

The patch looks good to me, but while we touch this area, how about
throwing in a few cosmetic fixes as well?

> -		if (i && page_addr != dma_addr) {
> +		if (i && (page_addr != dma_addr || last_page_off != 0)) {
>  			if (last_end_dma_addr != dma_addr) {

Wo about we one or two sentences for each of the conditions here?

>  				/* gap */
> -				goto done;
> -
> +				break;
>  			} else if (last_page_off + dma_len <= mr->page_size) {
>  				/* chunk this fragment with the last */
>  				mr->length += dma_len;

It would be great to avoid the else clauses if we already to a
break/continue/goto to make the code flow more clear, e.g.


			/*
			 * Gap to the previous segment, we'll need to return
			 * and use another FR to map the reminder.
			 */
			if (last_end_dma_addr != dma_addr)
				break;

			/*
			 * See if this segment is contiguous to the
			 * previous one and just merge it in that case.
			 */
			if (last_page_off + dma_len <= mr->page_size) {
				last_end_dma_addr += dma_len;
				last_page_off += dma_len;
				mr->length += dma_len;
				continue;
			}

			/*
			 * New page-aligned segment to map:
			 */
			page_addr = last_page_addr + mr->page_size;
			dma_len -= mr->page_size - last_page_off;


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 043a60e..1972c50 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1516,7 +1516,7 @@  EXPORT_SYMBOL(ib_map_mr_sg);
  * @sg_nents:      number of entries in sg
  * @set_page:      driver page assignment function pointer
  *
- * Core service helper for drivers to covert the largest
+ * Core service helper for drivers to convert the largest
  * prefix of given sg list to a page vector. The sg list
  * prefix converted is the prefix that meet the requirements
  * of ib_map_mr_sg.
@@ -1533,7 +1533,7 @@  int ib_sg_to_pages(struct ib_mr *mr,
 	u64 last_end_dma_addr = 0, last_page_addr = 0;
 	unsigned int last_page_off = 0;
 	u64 page_mask = ~((u64)mr->page_size - 1);
-	int i;
+	int i, ret;
 
 	mr->iova = sg_dma_address(&sgl[0]);
 	mr->length = 0;
@@ -1544,11 +1544,10 @@  int ib_sg_to_pages(struct ib_mr *mr,
 		u64 end_dma_addr = dma_addr + dma_len;
 		u64 page_addr = dma_addr & page_mask;
 
-		if (i && page_addr != dma_addr) {
+		if (i && (page_addr != dma_addr || last_page_off != 0)) {
 			if (last_end_dma_addr != dma_addr) {
 				/* gap */
-				goto done;
-
+				break;
 			} else if (last_page_off + dma_len <= mr->page_size) {
 				/* chunk this fragment with the last */
 				mr->length += dma_len;
@@ -1563,8 +1562,9 @@  int ib_sg_to_pages(struct ib_mr *mr,
 		}
 
 		do {
-			if (unlikely(set_page(mr, page_addr)))
-				goto done;
+			ret = set_page(mr, page_addr);
+			if (unlikely(ret < 0))
+				return i ? : ret;
 			page_addr += mr->page_size;
 		} while (page_addr < end_dma_addr);
 
@@ -1574,7 +1574,6 @@  int ib_sg_to_pages(struct ib_mr *mr,
 		last_page_off = end_dma_addr & ~page_mask;
 	}
 
-done:
 	return i;
 }
 EXPORT_SYMBOL(ib_sg_to_pages);