[v2,01/21] scatterlist: Introduce sg_map helper functions

Message ID 1493144468-22493-2-git-send-email-logang@deltatee.com
State New, archived

Commit Message

Logan Gunthorpe April 25, 2017, 6:20 p.m. UTC
This patch introduces functions which kmap the pages inside an sgl.
These functions replace a common pattern of kmap(sg_page(sg)) that is
used in more than 50 places within the kernel.

The motivation for this work is to eventually safely support sgls that
contain io memory. In order for that to work, any access to the contents
of an iomem SGL will need to be done with iomemcpy or hit some warning.
(The exact details of how this will work have yet to be worked out.)
Having all the kmaps in one place is just a first step in that
direction. Additionally, seeing this helps cut down the users of sg_page,
it should make any effort to go to struct-page-less DMAs a little
easier (should that idea ever swing back into favour again).

A flags option is added to select between a regular or atomic mapping so
these functions can replace kmap(sg_page or kmap_atomic(sg_page.
Future work may expand this to have flags for using page_address or
vmap. We include a flag to require the function not to fail to
support legacy code that has no easy error path. Much further in the
future, there may be a flag to allocate memory and copy the data
from/to iomem.

We also add the semantic that sg_map can fail to create a mapping,
despite the fact that the current code this is replacing is assumed to
never fail and the current version of these functions cannot fail. This
is to support iomem which may either have to fail to create the mapping or
allocate memory as a bounce buffer which itself can fail.

Also, in terms of cleanup, a few of the existing kmap(sg_page) users
play things a bit loose in terms of whether they apply sg->offset
so using these helper functions should help avoid such issues.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 include/linux/scatterlist.h | 85 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 85 insertions(+)
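
For readers following the offset handling described in the commit message, here is a minimal userspace sketch of the arithmetic, not the kernel code itself. Everything prefixed `model_` or `MODEL_` is hypothetical: the page shift of 12 is an assumption, the "mapping" is just pointer arithmetic on an ordinary buffer, and the error-pointer helpers are simplified stand-ins for the kernel's ERR_PTR, IS_ERR and PTR_ERR.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins; the real PAGE_SHIFT depends on the architecture. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  ((size_t)1 << MODEL_PAGE_SHIFT)
#define MODEL_MAX_ERRNO  4095

#define MODEL_SG_KMAP        (1 << 0)
#define MODEL_SG_KMAP_ATOMIC (1 << 1)

/* Simplified models of the kernel's error-pointer convention. */
static inline void *model_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline int model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

static inline long model_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Hypothetical model of one sg entry backed by contiguous pages. */
struct model_sg {
	unsigned char *base;	/* start of the first page */
	size_t offset;		/* plays the role of sg->offset */
};

/* Page index as computed in sg_map: sg->offset is added first. */
static inline size_t model_map_page_index(const struct model_sg *sg,
					  size_t offset)
{
	return (offset + sg->offset) >> MODEL_PAGE_SHIFT;
}

/* Page index as written in the posted sg_unmap: sg->offset is not added. */
static inline size_t model_unmap_page_index(size_t offset)
{
	return offset >> MODEL_PAGE_SHIFT;
}

/* Model of sg_map: pick the page, add the in-page offset, or return an error. */
static inline void *model_sg_map(const struct model_sg *sg, size_t offset,
				 int flags)
{
	size_t total = offset + sg->offset;
	size_t pg_off = total & (MODEL_PAGE_SIZE - 1);

	if (!(flags & (MODEL_SG_KMAP | MODEL_SG_KMAP_ATOMIC)))
		return model_err_ptr(-EINVAL);

	return sg->base + (model_map_page_index(sg, offset) << MODEL_PAGE_SHIFT)
			+ pg_off;
}
```

A caller would check the result with `model_is_err()` before dereferencing it, mirroring the IS_ERR check the patch documents. The model also suggests one thing worth double-checking in the patch: with an entry offset of 200 and a caller offset of 4000, the map-side page index is 1 while the unmap-side expression gives 0, so the kunmap path may not see the same page that was mapped.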

Comments

Christoph Hellwig April 26, 2017, 7:44 a.m. UTC | #1
On Tue, Apr 25, 2017 at 12:20:48PM -0600, Logan Gunthorpe wrote:
> This patch introduces functions which kmap the pages inside an sgl.
> These functions replace a common pattern of kmap(sg_page(sg)) that is
> used in more than 50 places within the kernel.
> 
> The motivation for this work is to eventually safely support sgls that
> contain io memory. In order for that to work, any access to the contents
> of an iomem SGL will need to be done with iomemcpy or hit some warning.
> (The exact details of how this will work have yet to be worked out.)

I think we'll at least need a draft of those to make sense of these
patches.  Otherwise they just look very clumsy.

> + *   Use this function to map a page in the scatterlist at the specified
> + *   offset. sg->offset is already added for you. Note: the semantics of
> + *   this function are that it may fail. Thus, its output should be checked
> + *   with IS_ERR and PTR_ERR. Otherwise, a pointer to the specified offset
> + *   in the mapped page is returned.
> + *
> + *   Flags can be any of:
> + *	* SG_KMAP		- Use kmap to create the mapping
> + *	* SG_KMAP_ATOMIC	- Use kmap_atomic to map the page atomically.
> + *				  Thus, the rules of that function apply: the
> + *				  cpu may not sleep until it is unmapped.
> + *	* SG_MAP_MUST_NOT_FAIL	- Indicate that sg_map must not fail.
> + *				  If it does, it will issue a BUG_ON instead.
> + *				  This is intended for legacy code only, it
> + *				  is not to be used in new code.

I'm sorry but this API is just a trainwreck.  Right now we have the
nice little kmap_atomic API, which never fails and has a very nice
calling convention where we just pass back the return address, but does
not support sleeping inside the critical section.

And kmap, which may fail and requires the original page to be passed
back.  Anything that mixes these two concepts up is simply a non-starter.
--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christian König April 26, 2017, 8:59 a.m. UTC | #2
Am 25.04.2017 um 20:20 schrieb Logan Gunthorpe:
> This patch introduces functions which kmap the pages inside an sgl.
> These functions replace a common pattern of kmap(sg_page(sg)) that is
> used in more than 50 places within the kernel.
>
> The motivation for this work is to eventually safely support sgls that
> contain io memory. In order for that to work, any access to the contents
> of an iomem SGL will need to be done with iomemcpy or hit some warning.
> (The exact details of how this will work have yet to be worked out.)
> Having all the kmaps in one place is just a first step in that
> direction. Additionally, seeing this helps cut down the users of sg_page,
> it should make any effort to go to struct-page-less DMAs a little
> easier (should that idea ever swing back into favour again).
>
> A flags option is added to select between a regular or atomic mapping so
> these functions can replace kmap(sg_page or kmap_atomic(sg_page.
> Future work may expand this to have flags for using page_address or
> vmap. We include a flag to require the function not to fail to
> support legacy code that has no easy error path. Much further in the
> future, there may be a flag to allocate memory and copy the data
> from/to iomem.
>
> We also add the semantic that sg_map can fail to create a mapping,
> despite the fact that the current code this is replacing is assumed to
> never fail and the current version of these functions cannot fail. This
> is to support iomem which may either have to fail to create the mapping or
> allocate memory as a bounce buffer which itself can fail.
>
> Also, in terms of cleanup, a few of the existing kmap(sg_page) users
> play things a bit loose in terms of whether they apply sg->offset
> so using these helper functions should help avoid such issues.
>
> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
> ---

Good to know that somebody is working on this. Those problems troubled 
us as well.

Patch is Acked-by: Christian König <christian.koenig@amd.com>.

Regards,
Christian.

>   include/linux/scatterlist.h | 85 +++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 85 insertions(+)
>
> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
> index cb3c8fe..fad170b 100644
> --- a/include/linux/scatterlist.h
> +++ b/include/linux/scatterlist.h
> @@ -5,6 +5,7 @@
>   #include <linux/types.h>
>   #include <linux/bug.h>
>   #include <linux/mm.h>
> +#include <linux/highmem.h>
>   #include <asm/io.h>
>   
>   struct scatterlist {
> @@ -126,6 +127,90 @@ static inline struct page *sg_page(struct scatterlist *sg)
>   	return (struct page *)((sg)->page_link & ~0x3);
>   }
>   
> +#define SG_KMAP		     (1 << 0)	/* create a mapping with kmap */
> +#define SG_KMAP_ATOMIC	     (1 << 1)	/* create a mapping with kmap_atomic */
> +#define SG_MAP_MUST_NOT_FAIL (1 << 2)	/* indicate sg_map should not fail */
> +
> +/**
> + * sg_map - kmap a page inside an sgl
> + * @sg:		SG entry
> + * @offset:	Offset into entry
> + * @flags:	Flags for creating the mapping
> + *
> + * Description:
> + *   Use this function to map a page in the scatterlist at the specified
> + *   offset. sg->offset is already added for you. Note: the semantics of
> + *   this function are that it may fail. Thus, its output should be checked
> + *   with IS_ERR and PTR_ERR. Otherwise, a pointer to the specified offset
> + *   in the mapped page is returned.
> + *
> + *   Flags can be any of:
> + *	* SG_KMAP		- Use kmap to create the mapping
> + *	* SG_KMAP_ATOMIC	- Use kmap_atomic to map the page atomically.
> + *				  Thus, the rules of that function apply: the
> + *				  cpu may not sleep until it is unmapped.
> + *	* SG_MAP_MUST_NOT_FAIL	- Indicate that sg_map must not fail.
> + *				  If it does, it will issue a BUG_ON instead.
> + *				  This is intended for legacy code only, it
> + *				  is not to be used in new code.
> + *
> + *   Also, consider carefully whether this function is appropriate. It is
> + *   largely not recommended for new code and if the sgl came from another
> + *   subsystem and you don't know what kind of memory might be in the list
> + *   then you definitely should not call it. Non-mappable memory may be in
> + *   the sgl and thus this function may fail unexpectedly. Consider using
> + *   sg_copy_to_buffer instead.
> + **/
> +static inline void *sg_map(struct scatterlist *sg, size_t offset, int flags)
> +{
> +	struct page *pg;
> +	unsigned int pg_off;
> +	void *ret;
> +
> +	offset += sg->offset;
> +	pg = nth_page(sg_page(sg), offset >> PAGE_SHIFT);
> +	pg_off = offset_in_page(offset);
> +
> +	if (flags & SG_KMAP_ATOMIC)
> +		ret = kmap_atomic(pg) + pg_off;
> +	else if (flags & SG_KMAP)
> +		ret = kmap(pg) + pg_off;
> +	else
> +		ret = ERR_PTR(-EINVAL);
> +
> +	/*
> +	 * In theory, this can't happen yet. Once we start adding
> +	 * unmapable memory, it also shouldn't happen unless developers
> +	 * start putting unmappable struct pages in sgls and passing
> +	 * it to code that doesn't support it.
> +	 */
> +	BUG_ON(flags & SG_MAP_MUST_NOT_FAIL && IS_ERR(ret));
> +
> +	return ret;
> +}
> +
> +/**
> + * sg_unmap - unmap a page that was mapped with sg_map
> + * @sg:		SG entry
> + * @addr:	address returned by sg_map
> + * @offset:	Offset into entry (same as specified for sg_map)
> + * @flags:	Flags, which are the same specified for sg_map
> + *
> + * Description:
> + *   Unmap the page that was mapped with sg_map
> + **/
> +static inline void sg_unmap(struct scatterlist *sg, void *addr,
> +			    size_t offset, int flags)
> +{
> +	struct page *pg = nth_page(sg_page(sg), offset >> PAGE_SHIFT);
> +	unsigned int pg_off = offset_in_page(offset);
> +
> +	if (flags & SG_KMAP_ATOMIC)
> +		kunmap_atomic(addr - sg->offset - pg_off);
> +	else if (flags & SG_KMAP)
> +		kunmap(pg);
> +}
> +
>   /**
>    * sg_set_buf - Set sg entry to point at given data
>    * @sg:		 SG entry


Logan Gunthorpe April 26, 2017, 6:11 p.m. UTC | #3
On 26/04/17 01:44 AM, Christoph Hellwig wrote:
> I think we'll at least need a draft of those to make sense of these
> patches.  Otherwise they just look very clumsy.

Ok, I'll work up a draft proposal and send it in a couple days. But
without a lot of cleanup such as this series it's not going to even be
able to compile.

> I'm sorry but this API is just a trainwreck.  Right now we have the
> nice little kmap_atomic API, which never fails and has a very nice
> calling convention where we just pass back the return address, but does
> not support sleeping inside the critical section.
> 
> And kmap, which may fail and requires the original page to be passed
> back.  Anything that mixes these two concepts up is simply a non-starter.

Ok, well for starters I think you are mistaken about kmap being able to
fail. I'm having a hard time finding many users of that function that
bother to check for an error when calling it. The main difficulty we
have now is that neither of those functions are expected to fail and we
need them to be able to in cases where the page doesn't map to system
RAM. This patch series is trying to address it for users of scatterlist.
I'm certainly open to other suggestions.

I also have to disagree that kmap and kmap_atomic are all that "nice".
Except for the sleeping restriction and performance, they effectively do
the same thing. And it was necessary to write a macro wrapper around
kunmap_atomic to ensure that users of that function don't screw it up.
(See 597781f3e5.) I'd say the kmap/kmap_atomic functions are the
trainwreck and I'm trying to do my best to cleanup a few cases.

There are a fair number of cases in the kernel that do something like:

if (something)
    x = kmap(page);
else
    x = kmap_atomic(page);
...
if (something)
    kunmap(page)
else
    kunmap_atomic(x)

Which just seems cumbersome to me.

In any case, if you can accept an sg_kmap and sg_kmap_atomic api just
say so and I'll make the change. But I'll still need a flags variable
for SG_MAP_MUST_NOT_FAIL to support legacy cases that have no fail path
and both of those functions will need to be pretty nearly replicas of
each other.

Logan


Logan Gunthorpe April 26, 2017, 11:30 p.m. UTC | #4
On 26/04/17 02:59 AM, Christian König wrote:
> Good to know that somebody is working on this. Those problems troubled
> us as well.

Thanks Christian. It's a daunting problem and there's a lot of work to
do before we will ever be where we need to be so any help, even an ack,
is greatly appreciated.

Logan

Christoph Hellwig April 27, 2017, 6:53 a.m. UTC | #5
On Wed, Apr 26, 2017 at 12:11:33PM -0600, Logan Gunthorpe wrote:
> Ok, well for starters I think you are mistaken about kmap being able to
> fail. I'm having a hard time finding many users of that function that
> bother to check for an error when calling it.

A quick audit of the arch code shows you're right - kmap can't fail
anywhere anymore.

> The main difficulty we
> have now is that neither of those functions are expected to fail and we
> need them to be able to in cases where the page doesn't map to system
> RAM. This patch series is trying to address it for users of scatterlist.
> I'm certainly open to other suggestions.

I think you'll need to follow the existing kmap semantics and never
fail the iomem version either.  Otherwise you'll have a special case
that's almost never used that has a different error path.

> There are a fair number of cases in the kernel that do something like:
> 
> if (something)
>     x = kmap(page);
> else
>     x = kmap_atomic(page);
> ...
> if (something)
>     kunmap(page)
> else
>     kunmap_atomic(x)
> 
> Which just seems cumbersome to me.

Passing a different flag based on something isn't really much better.

> In any case, if you can accept an sg_kmap and sg_kmap_atomic api just
> say so and I'll make the change. But I'll still need a flags variable
> for SG_MAP_MUST_NOT_FAIL to support legacy cases that have no fail path
> and both of those functions will need to be pretty nearly replicas of
> each other.

Again, wrong way.  Suddenly making things fail for your special case
that normally don't fail is a recipe for bugs.
Jason Gunthorpe April 27, 2017, 3:27 p.m. UTC | #6
On Thu, Apr 27, 2017 at 08:53:38AM +0200, Christoph Hellwig wrote:

> > The main difficulty we
> > have now is that neither of those functions are expected to fail and we
> > need them to be able to in cases where the page doesn't map to system
> > RAM. This patch series is trying to address it for users of scatterlist.
> > I'm certainly open to other suggestions.
> 
> I think you'll need to follow the existing kmap semantics and never
> fail the iomem version either.  Otherwise you'll have a special case
> that's almost never used that has a different error path.

How about first switching as many call sites as possible to use
sg_copy_X_buffer instead of kmap?

A random audit of Logan's series suggests this is actually a fairly
common thing.

eg drivers/mmc/host/sdhci.c is only doing this:

                                        buffer = sdhci_kmap_atomic(sg, &flags);
                                        memcpy(buffer, align, size);
                                        sdhci_kunmap_atomic(buffer, &flags);

drivers/scsi/mvsas/mv_sas.c is this:

+			to = sg_map(sg_resp, 0, SG_KMAP_ATOMIC);
+			memcpy(to,
+			       slot->response + sizeof(struct mvs_err_info),
+			       sg_dma_len(sg_resp));
+			sg_unmap(sg_resp, to, 0, SG_KMAP_ATOMIC);

etc.

Lots of other places seem similar, if not sometimes a little bit more
convoluted..

Switching all the trivial cases to use copy might bring more clarity
as to what is actually required for the remaining few users? If there
are only a few then it may no longer matter if the API is not idyllic.

Jason
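
To make the copy-based alternative concrete, here is a small userspace sketch of what a call such as `sg_copy_from_buffer(sgl, nents, buf, buflen)` does conceptually: it walks the entries in order and copies from a linear buffer, so the caller never handles a mapping. The `model_seg` type and `model_copy_from_buffer` function are hypothetical stand-ins, not kernel code; the real helper works on `struct scatterlist` and does the per-page mapping internally.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for one scatterlist entry. */
struct model_seg {
	unsigned char *buf;
	size_t len;
};

/*
 * Copy up to buflen bytes from a linear buffer into the segments, in order.
 * Returns the number of bytes actually copied, which is smaller than buflen
 * when the segments run out of room.
 */
static size_t model_copy_from_buffer(struct model_seg *segs, size_t nsegs,
				     const void *buf, size_t buflen)
{
	const unsigned char *src = buf;
	size_t copied = 0;

	for (size_t i = 0; i < nsegs && copied < buflen; i++) {
		size_t n = buflen - copied;

		/* Never write past the end of the current segment. */
		if (n > segs[i].len)
			n = segs[i].len;
		memcpy(segs[i].buf, src + copied, n);
		copied += n;
	}

	return copied;
}
```

In the two driver snippets quoted above, the map, memcpy, unmap sequence would collapse into a single call of this shape, which is why converting the trivial cases first could shrink the set of callers that genuinely need a mapping.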
Logan Gunthorpe April 27, 2017, 3:44 p.m. UTC | #7
On 27/04/17 12:53 AM, Christoph Hellwig wrote:
> I think you'll need to follow the existing kmap semantics and never
> fail the iomem version either.  Otherwise you'll have a special case
> that's almost never used that has a different error path.
>
> Again, wrong way.  Suddenly making things fail for your special case
> that normally don't fail is a recipe for bugs.

I don't disagree but these restrictions make the problem impossible to
solve? If there is iomem behind a page in an SGL and someone tries to
map it, we either have to fail or we break iomem safety which was your
original concern.

Logan

Logan Gunthorpe April 27, 2017, 3:57 p.m. UTC | #8
On 27/04/17 09:27 AM, Jason Gunthorpe wrote:
> On Thu, Apr 27, 2017 at 08:53:38AM +0200, Christoph Hellwig wrote:
> How about first switching as many call sites as possible to use
> sg_copy_X_buffer instead of kmap?

Yeah, I could look at doing that first.

One problem is we might get more Naks of the form of Herbert Xu's who
might be concerned with the performance implications.

These are definitely a bit more invasive changes than thin wrappers
around kmap calls.

> A random audit of Logan's series suggests this is actually a fairly
> common thing.

It's not _that_ common but there are a significant fraction. One of my
patches actually did this to two places that seemed to be reimplementing
the sg_copy_X_buffer logic.

Thanks,

Logan

Patch

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index cb3c8fe..fad170b 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -5,6 +5,7 @@ 
 #include <linux/types.h>
 #include <linux/bug.h>
 #include <linux/mm.h>
+#include <linux/highmem.h>
 #include <asm/io.h>
 
 struct scatterlist {
@@ -126,6 +127,90 @@  static inline struct page *sg_page(struct scatterlist *sg)
 	return (struct page *)((sg)->page_link & ~0x3);
 }
 
+#define SG_KMAP		     (1 << 0)	/* create a mapping with kmap */
+#define SG_KMAP_ATOMIC	     (1 << 1)	/* create a mapping with kmap_atomic */
+#define SG_MAP_MUST_NOT_FAIL (1 << 2)	/* indicate sg_map should not fail */
+
+/**
+ * sg_map - kmap a page inside an sgl
+ * @sg:		SG entry
+ * @offset:	Offset into entry
+ * @flags:	Flags for creating the mapping
+ *
+ * Description:
+ *   Use this function to map a page in the scatterlist at the specified
+ *   offset. sg->offset is already added for you. Note: the semantics of
+ *   this function are that it may fail. Thus, its output should be checked
+ *   with IS_ERR and PTR_ERR. Otherwise, a pointer to the specified offset
+ *   in the mapped page is returned.
+ *
+ *   Flags can be any of:
+ *	* SG_KMAP		- Use kmap to create the mapping
+ *	* SG_KMAP_ATOMIC	- Use kmap_atomic to map the page atomically.
+ *				  Thus, the rules of that function apply: the
+ *				  cpu may not sleep until it is unmapped.
+ *	* SG_MAP_MUST_NOT_FAIL	- Indicate that sg_map must not fail.
+ *				  If it does, it will issue a BUG_ON instead.
+ *				  This is intended for legacy code only, it
+ *				  is not to be used in new code.
+ *
+ *   Also, consider carefully whether this function is appropriate. It is
+ *   largely not recommended for new code and if the sgl came from another
+ *   subsystem and you don't know what kind of memory might be in the list
+ *   then you definitely should not call it. Non-mappable memory may be in
+ *   the sgl and thus this function may fail unexpectedly. Consider using
+ *   sg_copy_to_buffer instead.
+ **/
+static inline void *sg_map(struct scatterlist *sg, size_t offset, int flags)
+{
+	struct page *pg;
+	unsigned int pg_off;
+	void *ret;
+
+	offset += sg->offset;
+	pg = nth_page(sg_page(sg), offset >> PAGE_SHIFT);
+	pg_off = offset_in_page(offset);
+
+	if (flags & SG_KMAP_ATOMIC)
+		ret = kmap_atomic(pg) + pg_off;
+	else if (flags & SG_KMAP)
+		ret = kmap(pg) + pg_off;
+	else
+		ret = ERR_PTR(-EINVAL);
+
+	/*
+	 * In theory, this can't happen yet. Once we start adding
+	 * unmapable memory, it also shouldn't happen unless developers
+	 * start putting unmappable struct pages in sgls and passing
+	 * it to code that doesn't support it.
+	 */
+	BUG_ON(flags & SG_MAP_MUST_NOT_FAIL && IS_ERR(ret));
+
+	return ret;
+}
+
+/**
+ * sg_unmap - unmap a page that was mapped with sg_map
+ * @sg:		SG entry
+ * @addr:	address returned by sg_map
+ * @offset:	Offset into entry (same as specified for sg_map)
+ * @flags:	Flags, which are the same specified for sg_map
+ *
+ * Description:
+ *   Unmap the page that was mapped with sg_map
+ **/
+static inline void sg_unmap(struct scatterlist *sg, void *addr,
+			    size_t offset, int flags)
+{
+	struct page *pg = nth_page(sg_page(sg), offset >> PAGE_SHIFT);
+	unsigned int pg_off = offset_in_page(offset);
+
+	if (flags & SG_KMAP_ATOMIC)
+		kunmap_atomic(addr - sg->offset - pg_off);
+	else if (flags & SG_KMAP)
+		kunmap(pg);
+}
+
 /**
  * sg_set_buf - Set sg entry to point at given data
  * @sg:		 SG entry