diff mbox

[RFC,ABI,V5,07/10] IB/core: Support getting IOCTL header/SGEs from kernel space

Message ID 1477579398-6875-8-git-send-email-matanb@mellanox.com (mailing list archive)
State RFC
Headers show

Commit Message

Matan Barak Oct. 27, 2016, 2:43 p.m. UTC
In order to allow compatability header, allow passing
ib_uverbs_ioctl_hdr and ib_uverbs_attr from kernel space.

Signed-off-by: Matan Barak <matanb@mellanox.com>
---
 drivers/infiniband/core/uverbs_ioctl.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

Comments

Christoph Hellwig Oct. 28, 2016, 6:59 a.m. UTC | #1
> +	mm_segment_t currentfs = get_fs();
>  
>  	if (!ib_dev)
>  		return -EIO;
> @@ -240,8 +242,10 @@ static long ib_uverbs_cmd_verbs(struct ib_device *ib_dev,
>  		goto out;
>  	}
>  
> +	set_fs(oldfs);
>  	err = uverbs_handle_action(buf, ctx->uattrs, hdr->num_attrs, ib_dev,
>  				   file, action, ctx->uverbs_attr_array);
> +	set_fs(currentfs);

Adding this magic in new code is not acceptable.  Any given API
must take either a kernel or a user pointer.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Leon Romanovsky Oct. 28, 2016, 3:16 p.m. UTC | #2
On Thu, Oct 27, 2016 at 11:59:43PM -0700, Christoph Hellwig wrote:
> > +	mm_segment_t currentfs = get_fs();
> >
> >  	if (!ib_dev)
> >  		return -EIO;
> > @@ -240,8 +242,10 @@ static long ib_uverbs_cmd_verbs(struct ib_device *ib_dev,
> >  		goto out;
> >  	}
> >
> > +	set_fs(oldfs);
> >  	err = uverbs_handle_action(buf, ctx->uattrs, hdr->num_attrs, ib_dev,
> >  				   file, action, ctx->uverbs_attr_array);
> > +	set_fs(currentfs);
>
> Adding this magic in new code is not acceptable.  Any given API
> must take either a kernel or a user pointer.

And it is indeed happen for new code. This magic is needed to allow legacy
write interface to be converted to new ioctl interface internally in
kernel.


> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig Oct. 28, 2016, 3:21 p.m. UTC | #3
On Fri, Oct 28, 2016 at 06:16:06PM +0300, Leon Romanovsky wrote:
> And it is indeed happen for new code. This magic is needed to allow legacy
> write interface to be converted to new ioctl interface internally in
> kernel.

You just added these statements which are by defintion new code.
Don't do that - create a clean kernel internal interface instead.

The canonical one would be to only pass kernel pointers in the internal
interface.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Leon Romanovsky Oct. 28, 2016, 3:33 p.m. UTC | #4
On Fri, Oct 28, 2016 at 08:21:38AM -0700, Christoph Hellwig wrote:
> On Fri, Oct 28, 2016 at 06:16:06PM +0300, Leon Romanovsky wrote:
> > And it is indeed happen for new code. This magic is needed to allow legacy
> > write interface to be converted to new ioctl interface internally in
> > kernel.
>
> You just added these statements which are by defintion new code.
> Don't do that - create a clean kernel internal interface instead.
>
> The canonical one would be to only pass kernel pointers in the internal
> interface.

Just to summarize, to be sure that I understood you correctly.

---------    --------------------
| write | -> | conversion logic | ---
---------    --------------------   |      ----------------------
                                    -----> | internal interface |
---------                           |      ----------------------
| ioctl | ---------------------------
---------

Am I right?

> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig Oct. 28, 2016, 3:37 p.m. UTC | #5
On Fri, Oct 28, 2016 at 06:33:06PM +0300, Leon Romanovsky wrote:
> Just to summarize, to be sure that I understood you correctly.
> 
> ---------    --------------------
> | write | -> | conversion logic | ---
> ---------    --------------------   |      ----------------------
>                                     -----> | internal interface |
> ---------                           |      ----------------------
> | ioctl | ---------------------------
> ---------
> 
> Am I right?

Yes, as long as the write and ioctl boxes do the copy_{from,to}_user.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Leon Romanovsky Oct. 28, 2016, 3:46 p.m. UTC | #6
On Fri, Oct 28, 2016 at 08:37:25AM -0700, Christoph Hellwig wrote:
> On Fri, Oct 28, 2016 at 06:33:06PM +0300, Leon Romanovsky wrote:
> > Just to summarize, to be sure that I understood you correctly.
> >
> > ---------    --------------------
> > | write | -> | conversion logic | ---
> > ---------    --------------------   |      ----------------------
> >                                     -----> | internal interface |
> > ---------                           |      ----------------------
> > | ioctl | ---------------------------
> > ---------
> >
> > Am I right?
>
> Yes, as long as the write and ioctl boxes do the copy_{from,to}_user.

Thanks

> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Matan Barak Oct. 30, 2016, 8:48 a.m. UTC | #7
On Fri, Oct 28, 2016 at 5:46 PM, Leon Romanovsky <leonro@mellanox.com> wrote:
> On Fri, Oct 28, 2016 at 08:37:25AM -0700, Christoph Hellwig wrote:
>> On Fri, Oct 28, 2016 at 06:33:06PM +0300, Leon Romanovsky wrote:
>> > Just to summarize, to be sure that I understood you correctly.
>> >
>> > ---------    --------------------
>> > | write | -> | conversion logic | ---
>> > ---------    --------------------   |      ----------------------
>> >                                     -----> | internal interface |
>> > ---------                           |      ----------------------
>> > | ioctl | ---------------------------
>> > ---------
>> >
>> > Am I right?
>>
>> Yes, as long as the write and ioctl boxes do the copy_{from,to}_user.
>
> Thanks
>

If we accept the limitations here (i.e - all commands attributes come
either from kernel or from user,
but you can't mix them - that's mean the write comparability layer
either needs to copy all attributes
or use a direct mapping for all of them), I could just either break
ib_uverbs_cmd_verbs to a a few functions
or just pass a callback of boxing the descriptors copy.

>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe Nov. 8, 2016, 12:43 a.m. UTC | #8
On Sun, Oct 30, 2016 at 10:48:39AM +0200, Matan Barak wrote:
> On Fri, Oct 28, 2016 at 5:46 PM, Leon Romanovsky <leonro@mellanox.com> wrote:
> > On Fri, Oct 28, 2016 at 08:37:25AM -0700, Christoph Hellwig wrote:
> >> On Fri, Oct 28, 2016 at 06:33:06PM +0300, Leon Romanovsky wrote:
> >> > Just to summarize, to be sure that I understood you correctly.
> >> >
> >> > | write | -> | conversion logic | ---
> >> > | ioctl | ---------------------------
> >> >
> >> > Am I right?
> >>
> >> Yes, as long as the write and ioctl boxes do the copy_{from,to}_user.

> If we accept the limitations here (i.e - all commands attributes
> come either from kernel or from user, but you can't mix them -
> that's mean the write comparability layer either needs to copy all
> attributes or use a direct mapping for all of them), I could just
> either break ib_uverbs_cmd_verbs to a a few functions or just pass a
> callback of boxing the descriptors copy.

From what I saw in the series, this looks easy enough to fix..

Just lightly refactor things so that the write() compat layer can call
into the ioctl processor with an already prepared tlv list in kernel
memory and form such a list on the stack when doing the compat stuff.

The bigger problem is the tlv list pointers themselves, they have to
point to user memory so the compat layer can only do so much of a
transformation.

I guess another flag in the copy_from_user wrapper would do the trick
if we need it.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Matan Barak Nov. 9, 2016, 9:45 a.m. UTC | #9
On 08/11/2016 02:43, Jason Gunthorpe wrote:
> On Sun, Oct 30, 2016 at 10:48:39AM +0200, Matan Barak wrote:
>> On Fri, Oct 28, 2016 at 5:46 PM, Leon Romanovsky <leonro@mellanox.com> wrote:
>>> On Fri, Oct 28, 2016 at 08:37:25AM -0700, Christoph Hellwig wrote:
>>>> On Fri, Oct 28, 2016 at 06:33:06PM +0300, Leon Romanovsky wrote:
>>>>> Just to summarize, to be sure that I understood you correctly.
>>>>>
>>>>> | write | -> | conversion logic | ---
>>>>> | ioctl | ---------------------------
>>>>>
>>>>> Am I right?
>>>>
>>>> Yes, as long as the write and ioctl boxes do the copy_{from,to}_user.
>
>> If we accept the limitations here (i.e - all commands attributes
>> come either from kernel or from user, but you can't mix them -
>> that's mean the write comparability layer either needs to copy all
>> attributes or use a direct mapping for all of them), I could just
>> either break ib_uverbs_cmd_verbs to a a few functions or just pass a
>> callback of boxing the descriptors copy.
>
> From what I saw in the series, this looks easy enough to fix..
>
> Just lightly refactor things so that the write() compat layer can call
> into the ioctl processor with an already prepared tlv list in kernel
> memory and form such a list on the stack when doing the compat stuff.
>

Yeah, it's just an easy refactor of ib_uverbs_cmd_verbs and there's 
multiple ways of doing that :)

> The bigger problem is the tlv list pointers themselves, they have to
> point to user memory so the compat layer can only do so much of a
> transformation.
>
> I guess another flag in the copy_from_user wrapper would do the trick
> if we need it.
>

Currently we assume the payload itself is in user-space only so direct 
mapping is mandatory.
If we ever need to do something other than (bunch of consecutive write 
ABI struct fields) -> (attribute in the ioctl world), we'll have to box 
these copy macros/functions with copy_from_attr and copy_to_attr calls.

> Jason

Matan

> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/core/uverbs_ioctl.c b/drivers/infiniband/core/uverbs_ioctl.c
index 9d56b17..c02b6d5 100644
--- a/drivers/infiniband/core/uverbs_ioctl.c
+++ b/drivers/infiniband/core/uverbs_ioctl.c
@@ -176,10 +176,11 @@  static int uverbs_handle_action(struct ib_uverbs_attr __user *uattr_ptr,
 	return ret;
 }
 
-static long ib_uverbs_cmd_verbs(struct ib_device *ib_dev,
-				struct ib_uverbs_file *file,
-				struct ib_uverbs_ioctl_hdr *hdr,
-				void __user *buf)
+long ib_uverbs_cmd_verbs(struct ib_device *ib_dev,
+			 struct ib_uverbs_file *file,
+			 struct ib_uverbs_ioctl_hdr *hdr,
+			 void __user *buf,
+			 mm_segment_t oldfs)
 {
 	const struct uverbs_type *type;
 	const struct uverbs_action *action;
@@ -193,6 +194,7 @@  static long ib_uverbs_cmd_verbs(struct ib_device *ib_dev,
 	} *ctx = NULL;
 	struct uverbs_attr *curr_attr;
 	size_t ctx_size;
+	mm_segment_t currentfs = get_fs();
 
 	if (!ib_dev)
 		return -EIO;
@@ -240,8 +242,10 @@  static long ib_uverbs_cmd_verbs(struct ib_device *ib_dev,
 		goto out;
 	}
 
+	set_fs(oldfs);
 	err = uverbs_handle_action(buf, ctx->uattrs, hdr->num_attrs, ib_dev,
 				   file, action, ctx->uverbs_attr_array);
+	set_fs(currentfs);
 out:
 	kfree(ctx);
 	return err;
@@ -296,7 +300,8 @@  long ib_uverbs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 		if (!down_read_trylock(&file->close_sem))
 			return -EIO;
 		err = ib_uverbs_cmd_verbs(ib_dev, file, &hdr,
-					  (__user void *)arg + sizeof(hdr));
+					  (__user void *)arg + sizeof(hdr),
+					  get_fs());
 		up_read(&file->close_sem);
 	}
 out: