
[v2,1/1] block: blk-merge: don't merge the pages with non-contiguous descriptors

Message ID 20130117104741.GY23505@n2100.arm.linux.org.uk (mailing list archive)
State New, archived

Commit Message

Russell King - ARM Linux Jan. 17, 2013, 10:47 a.m. UTC
On Thu, Jan 17, 2013 at 10:37:42AM +0000, Russell King - ARM Linux wrote:
> On Thu, Jan 17, 2013 at 09:11:20AM +0000, James Bottomley wrote:
> > I'd actually prefer page = pfn_to_page(page_to_pfn(page) + 1); because
> > it makes the code look like the hack it is.  The preferred form for all
> > iterators like this should be to iterate over the pfn instead of a
> > pointer into the page arrays, because that will always work correctly no
> > matter how many weird and wonderful memory schemes we come up with.
> 
> So, why don't we update the code to do that then?

Also, couldn't the addition of the scatterlist offset to the page also
be buggy too?

So, what about this patch which addresses both additions by keeping our
iterator as a pfn as you suggest.  It also simplifies some of the code
in the loop too.
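
The reason a pfn iterator is safer than page++ can be shown with a small user-space model. This is only a sketch, not kernel code: the two-section layout, the gap between the descriptor arrays, and the model_pfn_to_page()/model_page_to_pfn() helpers are all hypothetical stand-ins for a memory model whose page descriptors are not contiguous across a section boundary.

```c
#include <stddef.h>

/* Hypothetical model: two sections of 4 pages each.  The descriptor
 * arrays of the sections are separated by a gap, so descriptors are
 * contiguous within a section but not across the boundary. */
#define PAGES_PER_SECTION 4UL
#define NR_SECTIONS       2UL
#define GAP_PAGES         3UL
#define NO_PFN            (~0UL)

struct page { unsigned long flags; };

static struct page backing[2 * PAGES_PER_SECTION + GAP_PAGES];

/* Start of each section's descriptor array inside the backing store. */
static struct page *const section_map[NR_SECTIONS] = {
	&backing[0],
	&backing[PAGES_PER_SECTION + GAP_PAGES],
};

/* pfn -> descriptor: always goes through the section table. */
static struct page *model_pfn_to_page(unsigned long pfn)
{
	unsigned long section = pfn / PAGES_PER_SECTION;

	if (section >= NR_SECTIONS)
		return NULL;
	return section_map[section] + pfn % PAGES_PER_SECTION;
}

/* descriptor -> pfn, or NO_PFN if the pointer is not a valid descriptor. */
static unsigned long model_page_to_pfn(const struct page *page)
{
	for (unsigned long s = 0; s < NR_SECTIONS; s++) {
		const struct page *base = section_map[s];

		if (page >= base && page < base + PAGES_PER_SECTION)
			return s * PAGES_PER_SECTION + (unsigned long)(page - base);
	}
	return NO_PFN;
}
```

In this model, incrementing the descriptor pointer of pfn 3 lands in the gap rather than on the descriptor of pfn 4, whereas incrementing the pfn and converting it back always yields a valid descriptor.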

Can the original folk with the problem test this patch?

 arch/arm/mm/dma-mapping.c |   18 ++++++++++--------
 1 files changed, 10 insertions(+), 8 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

James Bottomley Jan. 17, 2013, 11:01 a.m. UTC | #1
On Thu, 2013-01-17 at 10:47 +0000, Russell King - ARM Linux wrote:
> On Thu, Jan 17, 2013 at 10:37:42AM +0000, Russell King - ARM Linux wrote:
> > On Thu, Jan 17, 2013 at 09:11:20AM +0000, James Bottomley wrote:
> > > I'd actually prefer page = pfn_to_page(page_to_pfn(page) + 1); because
> > > it makes the code look like the hack it is.  The preferred form for all
> > > iterators like this should be to iterate over the pfn instead of a
> > > pointer into the page arrays, because that will always work correctly no
> > > matter how many weird and wonderful memory schemes we come up with.
> > 
> > So, why don't we update the code to do that then?

We can, but it involves quite a rewrite within the arm dma-mapping code
to use pfn instead of page.  It looks like it would make the code
cleaner because there are a lot of page_to_pfn transformations in there.
However, the current patch is the simplest one for stable and I don't
actually have any arm build and test environments.

> Also, couldn't the addition of the scatterlist offset to the page also
> be buggy too?

No, fortunately, offset must be within the first page from the point of
view of block generated sg lists.  As long as nothing within arm
violates this, it should be a safe assumption ... although the code
seems to assume otherwise.

James

> So, what about this patch which addresses both additions by keeping our
> iterator as a pfn as you suggest.  It also simplifies some of the code
> in the loop too.
> 
> Can the original folk with the problem test this patch?
> 
>  arch/arm/mm/dma-mapping.c |   18 ++++++++++--------
>  1 files changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index 6b2fb87..076c26d 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -774,25 +774,27 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset,
>  	size_t size, enum dma_data_direction dir,
>  	void (*op)(const void *, size_t, int))
>  {
> +	unsigned long pfn;
> +	size_t left = size;
> +
> +	pfn = page_to_pfn(page) + offset / PAGE_SIZE;
> +	offset %= PAGE_SIZE;
> +
>  	/*
>  	 * A single sg entry may refer to multiple physically contiguous
>  	 * pages.  But we still need to process highmem pages individually.
>  	 * If highmem is not configured then the bulk of this loop gets
>  	 * optimized out.
>  	 */
> -	size_t left = size;
>  	do {
>  		size_t len = left;
>  		void *vaddr;
>  
> +		page = pfn_to_page(pfn);
> +
>  		if (PageHighMem(page)) {
> -			if (len + offset > PAGE_SIZE) {
> -				if (offset >= PAGE_SIZE) {
> -					page += offset / PAGE_SIZE;
> -					offset %= PAGE_SIZE;
> -				}
> +			if (len + offset > PAGE_SIZE)
>  				len = PAGE_SIZE - offset;
> -			}
>  			vaddr = kmap_high_get(page);
>  			if (vaddr) {
>  				vaddr += offset;
> @@ -809,7 +811,7 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset,
>  			op(vaddr, len, dir);
>  		}
>  		offset = 0;
> -		page++;
> +		pfn++;
>  		left -= len;
>  	} while (left);
>  }

Looks reasonable modulo all the simplification we could do if we can
assume offset < PAGE_SIZE

James
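
To make the arithmetic in the patch concrete, here is a minimal user-space sketch of the same idea: normalise the offset once, up front, and then walk the range one pfn at a time. The names MODEL_PAGE_SIZE, struct chunk and split_per_page() are hypothetical, a 4096-byte page is assumed, and the sketch always splits per page (which the patch only does on the highmem path).

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed page size for this sketch */

/* One piece of the range that lies entirely within a single page. */
struct chunk {
	unsigned long pfn;
	size_t offset;
	size_t len;
};

/*
 * Split (first_pfn, offset, size) into per-page chunks.  The offset may
 * be larger than a page; it is folded into the pfn before the loop, so
 * inside the loop offset is always smaller than MODEL_PAGE_SIZE.
 * Returns the number of chunks written to out (at most max_chunks).
 */
static size_t split_per_page(unsigned long first_pfn, size_t offset,
			     size_t size, struct chunk *out, size_t max_chunks)
{
	unsigned long pfn = first_pfn + offset / MODEL_PAGE_SIZE;
	size_t left = size;
	size_t n = 0;

	offset %= MODEL_PAGE_SIZE;

	while (left && n < max_chunks) {
		size_t len = left;

		/* Clamp the chunk so it does not cross the page end. */
		if (len + offset > MODEL_PAGE_SIZE)
			len = MODEL_PAGE_SIZE - offset;

		out[n].pfn = pfn;
		out[n].offset = offset;
		out[n].len = len;
		n++;

		offset = 0;	/* later pages start at their beginning */
		pfn++;		/* advance the pfn, not a page pointer */
		left -= len;
	}
	return n;
}
```

For example, starting at pfn 100 with offset 5000 and size 8192, the first chunk begins at pfn 101, offset 904, and the range ends 904 bytes into pfn 103.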


Russell King - ARM Linux Jan. 17, 2013, 11:04 a.m. UTC | #2
On Thu, Jan 17, 2013 at 11:01:47AM +0000, James Bottomley wrote:
> On Thu, 2013-01-17 at 10:47 +0000, Russell King - ARM Linux wrote:
> > Also, couldn't the addition of the scatterlist offset to the page also
> > be buggy too?
> 
> No, fortunately, offset must be within the first page from the point of
> view of block generated sg lists.  As long as nothing within arm
> violates this, it should be a safe assumption ... although the code
> seems to assume otherwise.

Are you absolutely sure about that?  I believe I have seen cases where
that has been violated in the past, though it was many years ago.
James Bottomley Jan. 17, 2013, 11:19 a.m. UTC | #3
On Thu, 2013-01-17 at 11:04 +0000, Russell King - ARM Linux wrote:
> On Thu, Jan 17, 2013 at 11:01:47AM +0000, James Bottomley wrote:
> > On Thu, 2013-01-17 at 10:47 +0000, Russell King - ARM Linux wrote:
> > > Also, couldn't the addition of the scatterlist offset to the page also
> > > be buggy too?
> > 
> > No, fortunately, offset must be within the first page from the point of
> > view of block generated sg lists.  As long as nothing within arm
> > violates this, it should be a safe assumption ... although the code
> > seems to assume otherwise.
> 
> Are you absolutely sure about that?  I believe I have seen cases where
> that has been violated in the past, though it was many years ago.

From the point of view of the block layer, absolutely: the scatterlist
is generated from an array of bio_vecs.  Each bio_vec is a page, offset
and length element and obeys the rule that offset must be within the
page and offset + length cannot stray over the page.
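
The rule as stated above can be written down as a simple predicate. This is a sketch only: vec_fits_one_page() and MODEL_PAGE_SIZE are hypothetical names, a 4096-byte page is assumed, and the check mirrors the rule as described in this mail rather than any code in the block layer.

```c
#include <stdbool.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed page size for this sketch */

/*
 * True if an (offset, len) pair obeys the rule described above:
 * the offset lies within the page, and offset + len does not run
 * past the end of that page.
 */
static bool vec_fits_one_page(unsigned int offset, unsigned int len)
{
	if (offset >= MODEL_PAGE_SIZE)
		return false;
	/* Widen before adding so the sum cannot wrap. */
	return (unsigned long)offset + len <= MODEL_PAGE_SIZE;
}
```

A scatterlist entry with an offset of PAGE_SIZE or more, as discussed earlier in the thread, would fail this check, which is the case the pfn normalisation in the patch is meant to tolerate.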

From the point of view of other arm stuff, I don't know.

James


Russell King - ARM Linux Jan. 17, 2013, 11:40 a.m. UTC | #4
On Thu, Jan 17, 2013 at 11:19:21AM +0000, James Bottomley wrote:
> On Thu, 2013-01-17 at 11:04 +0000, Russell King - ARM Linux wrote:
> > On Thu, Jan 17, 2013 at 11:01:47AM +0000, James Bottomley wrote:
> > > On Thu, 2013-01-17 at 10:47 +0000, Russell King - ARM Linux wrote:
> > > > Also, couldn't the addition of the scatterlist offset to the page also
> > > > be buggy too?
> > > 
> > > No, fortunately, offset must be within the first page from the point of
> > > view of block generated sg lists.  As long as nothing within arm
> > > violates this, it should be a safe assumption ... although the code
> > > seems to assume otherwise.
> > 
> > Are you absolutely sure about that?  I believe I have seen cases where
> > that has been violated in the past, though it was many years ago.
> 
> From the point of view of the block layer, absolutely: the scatterlist
> is generated from an array of bio_vecs.  Each bio_vec is a page, offset
> and length element and obeys the rule that offset must be within the
> page and offset + length cannot stray over the page.

Well, I found it when working on the mmc stuff initially, long before
it got complex.  The scatterlists were unmodified from the block layer,
and I'm positive I saw occasions where the offsets in the scatterlists
were larger than PAGE_SIZE.

> From the point of view of other arm stuff, I don't know.

I'm not talking about anything ARM specific here.
subhashj@codeaurora.org Jan. 17, 2013, 2:58 p.m. UTC | #5
On 1/17/2013 4:17 PM, Russell King - ARM Linux wrote:
> On Thu, Jan 17, 2013 at 10:37:42AM +0000, Russell King - ARM Linux wrote:
>> On Thu, Jan 17, 2013 at 09:11:20AM +0000, James Bottomley wrote:
>>> I'd actually prefer page = pfn_to_page(page_to_pfn(page) + 1); because
>>> it makes the code look like the hack it is.  The preferred form for all
>>> iterators like this should be to iterate over the pfn instead of a
>>> pointer into the page arrays, because that will always work correctly no
>>> matter how many weird and wonderful memory schemes we come up with.
>> So, why don't we update the code to do that then?
> Also, couldn't the addition of the scatterlist offset to the page also
> be buggy too?
>
> So, what about this patch which addresses both additions by keeping our
> iterator as a pfn as you suggest.  It also simplifies some of the code
> in the loop too.
>
> Can the original folk with the problem test this patch?

Yes, this patch also fixes the issue.  You may add:
Tested-by: Subhash Jadavani <subhashj@codeaurora.org>

Regards,
Subhash

>
>   arch/arm/mm/dma-mapping.c |   18 ++++++++++--------
>   1 files changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index 6b2fb87..076c26d 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -774,25 +774,27 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset,
>   	size_t size, enum dma_data_direction dir,
>   	void (*op)(const void *, size_t, int))
>   {
> +	unsigned long pfn;
> +	size_t left = size;
> +
> +	pfn = page_to_pfn(page) + offset / PAGE_SIZE;
> +	offset %= PAGE_SIZE;
> +
>   	/*
>   	 * A single sg entry may refer to multiple physically contiguous
>   	 * pages.  But we still need to process highmem pages individually.
>   	 * If highmem is not configured then the bulk of this loop gets
>   	 * optimized out.
>   	 */
> -	size_t left = size;
>   	do {
>   		size_t len = left;
>   		void *vaddr;
>   
> +		page = pfn_to_page(pfn);
> +
>   		if (PageHighMem(page)) {
> -			if (len + offset > PAGE_SIZE) {
> -				if (offset >= PAGE_SIZE) {
> -					page += offset / PAGE_SIZE;
> -					offset %= PAGE_SIZE;
> -				}
> +			if (len + offset > PAGE_SIZE)
>   				len = PAGE_SIZE - offset;
> -			}
>   			vaddr = kmap_high_get(page);
>   			if (vaddr) {
>   				vaddr += offset;
> @@ -809,7 +811,7 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset,
>   			op(vaddr, len, dir);
>   		}
>   		offset = 0;
> -		page++;
> +		pfn++;
>   		left -= len;
>   	} while (left);
>   }


Patch

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 6b2fb87..076c26d 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -774,25 +774,27 @@  static void dma_cache_maint_page(struct page *page, unsigned long offset,
 	size_t size, enum dma_data_direction dir,
 	void (*op)(const void *, size_t, int))
 {
+	unsigned long pfn;
+	size_t left = size;
+
+	pfn = page_to_pfn(page) + offset / PAGE_SIZE;
+	offset %= PAGE_SIZE;
+
 	/*
 	 * A single sg entry may refer to multiple physically contiguous
 	 * pages.  But we still need to process highmem pages individually.
 	 * If highmem is not configured then the bulk of this loop gets
 	 * optimized out.
 	 */
-	size_t left = size;
 	do {
 		size_t len = left;
 		void *vaddr;
 
+		page = pfn_to_page(pfn);
+
 		if (PageHighMem(page)) {
-			if (len + offset > PAGE_SIZE) {
-				if (offset >= PAGE_SIZE) {
-					page += offset / PAGE_SIZE;
-					offset %= PAGE_SIZE;
-				}
+			if (len + offset > PAGE_SIZE)
 				len = PAGE_SIZE - offset;
-			}
 			vaddr = kmap_high_get(page);
 			if (vaddr) {
 				vaddr += offset;
@@ -809,7 +811,7 @@  static void dma_cache_maint_page(struct page *page, unsigned long offset,
 			op(vaddr, len, dir);
 		}
 		offset = 0;
-		page++;
+		pfn++;
 		left -= len;
 	} while (left);
 }