Message ID | 20160512171846.6198.31415.stgit@scvm10.sc.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, May 12, 2016 at 10:18:47AM -0700, Dennis Dalessandro wrote: > + case HFI1_IOCTL_EP_INFO: > + case HFI1_IOCTL_EP_ERASE_CHIP: > + case HFI1_IOCTL_EP_ERASE_RANGE: > + case HFI1_IOCTL_EP_READ_RANGE: > + case HFI1_IOCTL_EP_WRITE_RANGE: > + if (!capable(CAP_SYS_ADMIN)) > + return -EPERM; > + if (copy_from_user(&ucmd, > + (struct hfi11_cmd __user *)arg, > + sizeof(ucmd))) > + return -EFAULT; > + return handle_eprom_command(fp, &ucmd); I thought we agreed to get rid of this as well? It certainly does not belong here, and as a general rule, I don't think ioctls should be doing capable tests.. > +static inline int check_ioctl_access(unsigned int cmd, unsigned long arg) > +{ > + int read_cmd, write_cmd, read_ok, write_ok; > + > + read_cmd = _IOC_DIR(cmd) & _IOC_READ; > + write_cmd = _IOC_DIR(cmd) & _IOC_WRITE; > + write_ok = access_ok(VERIFY_WRITE, (void __user *)arg, _IOC_SIZE(cmd)); > + read_ok = access_ok(VERIFY_READ, (void __user *)arg, _IOC_SIZE(cmd)); > + > + if ((read_cmd && !write_ok) || (write_cmd && !read_ok)) > + return -EFAULT; This seems kind of goofy, didn't Ira say this is performance senstive? Driver shouldn't be open coding __get_user like that, IMHO. > +#define HFI1_IOCTL_RECV_CTRL \ > + _IOW(IB_IOCTL_MAGIC, HFI1_CMD_RECV_CTRL, int) Have you audited this? Confused why this is marked IOW when I see this: + case HFI1_IOCTL_RECV_CTRL: + ret = __get_user(uval, (int __user *)arg); Seeing many other examples. I stopped looking again Jason -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On Thu, May 12, 2016 at 10:18:47AM -0700, Dennis Dalessandro wrote: > > + case HFI1_IOCTL_EP_INFO: > > + case HFI1_IOCTL_EP_ERASE_CHIP: > > + case HFI1_IOCTL_EP_ERASE_RANGE: > > + case HFI1_IOCTL_EP_READ_RANGE: > > + case HFI1_IOCTL_EP_WRITE_RANGE: > > + if (!capable(CAP_SYS_ADMIN)) > > + return -EPERM; > > + if (copy_from_user(&ucmd, > > + (struct hfi11_cmd __user *)arg, > > + sizeof(ucmd))) > > + return -EFAULT; > > + return handle_eprom_command(fp, &ucmd); > > I thought we agreed to get rid of this as well? It certainly does not > belong here, and as a general rule, I don't think ioctls should be > doing capable tests.. At least the drm ioctl code has similar capable test http://lxr.free-electrons.com/source/drivers/gpu/drm/drm_ioctl.c#L519 > > +static inline int check_ioctl_access(unsigned int cmd, unsigned long > arg) > > +{ > > + int read_cmd, write_cmd, read_ok, write_ok; > > + > > + read_cmd = _IOC_DIR(cmd) & _IOC_READ; > > + write_cmd = _IOC_DIR(cmd) & _IOC_WRITE; > > + write_ok = access_ok(VERIFY_WRITE, (void __user *)arg, > _IOC_SIZE(cmd)); > > + read_ok = access_ok(VERIFY_READ, (void __user *)arg, _IOC_SIZE(cmd)); > > + > > + if ((read_cmd && !write_ok) || (write_cmd && !read_ok)) > > + return -EFAULT; > > This seems kind of goofy, didn't Ira say this is performance senstive? > > Driver shouldn't be open coding __get_user like that, IMHO. FWIW, drm keeps an ioctl 'descriptor', which maintains a kernel copy of the ioctl cmd. It uses the kernel's version of the cmd for processing, instead of the cmd value passed in from user space. It doesn't open code get_user or do checks similar to what's here. But if there's concern that the cmd value cannot be trusted, a similar descriptor mechanism could be used here. - Sean -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, May 12, 2016 at 11:43:32AM -0600, Jason Gunthorpe wrote: >On Thu, May 12, 2016 at 10:18:47AM -0700, Dennis Dalessandro wrote: >> + case HFI1_IOCTL_EP_INFO: >> + case HFI1_IOCTL_EP_ERASE_CHIP: >> + case HFI1_IOCTL_EP_ERASE_RANGE: >> + case HFI1_IOCTL_EP_READ_RANGE: >> + case HFI1_IOCTL_EP_WRITE_RANGE: >> + if (!capable(CAP_SYS_ADMIN)) >> + return -EPERM; >> + if (copy_from_user(&ucmd, >> + (struct hfi11_cmd __user *)arg, >> + sizeof(ucmd))) >> + return -EFAULT; >> + return handle_eprom_command(fp, &ucmd); > >I thought we agreed to get rid of this as well? It certainly does not >belong here, and as a general rule, I don't think ioctls should be >doing capable tests.. Yeah. I left it in this patch set because this just "ports" our existing code to ioctl(). The eprom stuff is completely removed in another patch set that I posted shortly after this. It's at: http://marc.info/?l=linux-rdma&m=146307409301822&w=2 >> +static inline int check_ioctl_access(unsigned int cmd, unsigned long arg) >> +{ >> + int read_cmd, write_cmd, read_ok, write_ok; >> + >> + read_cmd = _IOC_DIR(cmd) & _IOC_READ; >> + write_cmd = _IOC_DIR(cmd) & _IOC_WRITE; >> + write_ok = access_ok(VERIFY_WRITE, (void __user *)arg, _IOC_SIZE(cmd)); >> + read_ok = access_ok(VERIFY_READ, (void __user *)arg, _IOC_SIZE(cmd)); >> + >> + if ((read_cmd && !write_ok) || (write_cmd && !read_ok)) >> + return -EFAULT; > >This seems kind of goofy, didn't Ira say this is performance senstive? I'll let Ira comment here on the performance aspect. I agree it looks goofy. Suggestion on how to make it look better? Or are you saying this is incorrect? >Driver shouldn't be open coding __get_user like that, IMHO. Can you explain what you mean here? We should not use __get_user()? > >> +#define HFI1_IOCTL_RECV_CTRL \ >> + _IOW(IB_IOCTL_MAGIC, HFI1_CMD_RECV_CTRL, int) > >Have you audited this? Confused why this is marked IOW when I see >this: > >+ case HFI1_IOCTL_RECV_CTRL: >+ ret = __get_user(uval, (int __user *)arg); > >Seeing many other examples. > >I stopped looking again _IOW means user is writing data to the device. So the device is reading data from the user. Or am I missing your point? -Denny -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, May 12, 2016 at 03:27:27PM -0400, Dennis Dalessandro wrote: > >I thought we agreed to get rid of this as well? It certainly does not > >belong here, and as a general rule, I don't think ioctls should be > >doing capable tests.. > > Yeah. I left it in this patch set because this just "ports" our existing > code to ioctl(). The eprom stuff is completely removed in another patch set > that I posted shortly after this. It's at: Adding code and then removing it in a later patch is not a best practice.. Just don't add it or define ioctl numbers at all.. > >>+static inline int check_ioctl_access(unsigned int cmd, unsigned long arg) > >>+{ > >>+ int read_cmd, write_cmd, read_ok, write_ok; > >>+ > >>+ read_cmd = _IOC_DIR(cmd) & _IOC_READ; > >>+ write_cmd = _IOC_DIR(cmd) & _IOC_WRITE; > >>+ write_ok = access_ok(VERIFY_WRITE, (void __user *)arg, _IOC_SIZE(cmd)); > >>+ read_ok = access_ok(VERIFY_READ, (void __user *)arg, _IOC_SIZE(cmd)); > >>+ > >>+ if ((read_cmd && !write_ok) || (write_cmd && !read_ok)) > >>+ return -EFAULT; > > > >This seems kind of goofy, didn't Ira say this is performance senstive? Well, calling access_ok twice when only once is typically needed is certainly not performant. Typically this check is done at every access via get_user/put_user/copy_to/from_user - why is it being hoisted like this? > >Driver shouldn't be open coding __get_user like that, IMHO. > > Can you explain what you mean here? We should not use __get_user()? Generally speaking, yes. Use get_user() that includes the correct access_ok. Unless there is a good reason to avoid it, the standard API should be used. > _IOW means user is writing data to the device. So the device is reading data > from the user. Or am I missing your point? You are right, I spaced on this when reading the above - 'write_ok' and 'write_cmd' seem like they should have been related, but really aren't. It is doing the right tests, just odd. (eg use names like write_cmd_ok, write_cmd for better clarity) Jason -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 05/12/2016 03:40 PM, Jason Gunthorpe wrote: > On Thu, May 12, 2016 at 03:27:27PM -0400, Dennis Dalessandro wrote: >>> I thought we agreed to get rid of this as well? It certainly does not >>> belong here, and as a general rule, I don't think ioctls should be >>> doing capable tests.. >> >> Yeah. I left it in this patch set because this just "ports" our existing >> code to ioctl(). The eprom stuff is completely removed in another patch set >> that I posted shortly after this. It's at: > > Adding code and then removing it in a later patch is not a best > practice.. Just don't add it or define ioctl numbers at all.. Yeah, but then that changes the nature of the patchset. It goes from being "We ported the existing write API to ioctl API" to "We ported existing write API to ioctl API and also changed the scope of the API in the process", which is considered bad coding practice (one stand alone change per commit). It's pretty 6 of one, half dozen of the other if you ask me. A better solution would have been to remove the EEPROM stuff from the write ioctl, then only port the remaining stuff. That would have avoided both coding issues, but I also understand how they got here, which is they did the ioctl conversion before they knew they were going to rip out the EEPROM code. In any case, the best fix would be to rebase the two series that are remaining and move any "rip out things like eeprom support" patches to prior to the ioctl patches and make it so that they rip out the write interface version of it instead, and then squash a second copy of the ioctl removal into this patch.
On Thu, May 12, 2016 at 03:48:16PM -0400, Doug Ledford wrote: > were going to rip out the EEPROM code. In any case, the best fix would > be to rebase the two series that are remaining and move any "rip out > things like eeprom support" patches to prior to the ioctl patches and > make it so that they rip out the write interface version of it instead, > and then squash a second copy of the ioctl removal into this patch. Yes, this would look best Jason -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, May 12, 2016 at 03:28:21PM -0600, Jason Gunthorpe wrote: >On Thu, May 12, 2016 at 03:48:16PM -0400, Doug Ledford wrote: >> were going to rip out the EEPROM code. In any case, the best fix would >> be to rebase the two series that are remaining and move any "rip out >> things like eeprom support" patches to prior to the ioctl patches and >> make it so that they rip out the write interface version of it instead, >> and then squash a second copy of the ioctl removal into this patch. > >Yes, this would look best The end result will end up being the exact same code so I don't have a problem doing that. Perhaps it would be better if I just combine the two patch sets into one series? I still need to look more into the kobj stuff about the cdev, if needed I'll add a patch for that as well. -Denny -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, May 12, 2016 at 01:40:06PM -0600, Jason Gunthorpe wrote: > On Thu, May 12, 2016 at 03:27:27PM -0400, Dennis Dalessandro wrote: > > > >>+static inline int check_ioctl_access(unsigned int cmd, unsigned long arg) > > >>+{ > > >>+ int read_cmd, write_cmd, read_ok, write_ok; > > >>+ > > >>+ read_cmd = _IOC_DIR(cmd) & _IOC_READ; > > >>+ write_cmd = _IOC_DIR(cmd) & _IOC_WRITE; > > >>+ write_ok = access_ok(VERIFY_WRITE, (void __user *)arg, _IOC_SIZE(cmd)); > > >>+ read_ok = access_ok(VERIFY_READ, (void __user *)arg, _IOC_SIZE(cmd)); > > >>+ > > >>+ if ((read_cmd && !write_ok) || (write_cmd && !read_ok)) > > >>+ return -EFAULT; > > > > > >This seems kind of goofy, didn't Ira say this is performance senstive? > > Well, calling access_ok twice when only once is typically needed is > certainly not performant. Typically this check is done at every access > via get_user/put_user/copy_to/from_user - why is it being hoisted like > this? > > > > >Driver shouldn't be open coding __get_user like that, IMHO. > > > > Can you explain what you mean here? We should not use __get_user()? > > Generally speaking, yes. Use get_user() that includes the correct > access_ok. Unless there is a good reason to avoid it, the standard API > should be used. I know this code was refactored while we were still submitting patches to Greg KH back in Nov/Dec. Part of this was cleaning up branch on error rather than success. Hence the check for access at the top of the function and early return. At that time I _thought_ there were multiple __get_users in some of the operations so a single common access_ok would speed those up. However, I don't see that happening any longer, so either I don't remember correctly, or we have made this cleaner. As it stands now I think you are correct we could use get_user and copy_to/from_user. Ira -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/staging/rdma/hfi1/common.h b/drivers/staging/rdma/hfi1/common.h index e9b6bb3..37ac229 100644 --- a/drivers/staging/rdma/hfi1/common.h +++ b/drivers/staging/rdma/hfi1/common.h @@ -178,7 +178,8 @@ HFI1_CAP_PKEY_CHECK | \ HFI1_CAP_NO_INTEGRITY) -#define HFI1_USER_SWVERSION ((HFI1_USER_SWMAJOR << 16) | HFI1_USER_SWMINOR) +#define HFI1_USER_SWVERSION ((HFI1_USER_SWMAJOR << HFI1_SWMAJOR_SHIFT) | \ + HFI1_USER_SWMINOR) #ifndef HFI1_KERN_TYPE #define HFI1_KERN_TYPE 0 @@ -349,6 +350,9 @@ struct hfi1_message_header { #define HFI1_BECN_MASK 1 #define HFI1_BECN_SMASK BIT(HFI1_BECN_SHIFT) +#define HFI1_PSM_IOC_BASE_SEQ 0x0 +#define HFI1_SNOOP_IOC_BASE_SEQ 0x80 /* leaves plenty of room for psm */ + static inline __u64 rhf_to_cpu(const __le32 *rbuf) { return __le64_to_cpu(*((__le64 *)rbuf)); diff --git a/drivers/staging/rdma/hfi1/diag.c b/drivers/staging/rdma/hfi1/diag.c index bb2409a..4cf6e8d 100644 --- a/drivers/staging/rdma/hfi1/diag.c +++ b/drivers/staging/rdma/hfi1/diag.c @@ -168,9 +168,7 @@ struct hfi1_link_info { */ #define SNOOP_CAPTURE_VERSION 0x1 -#define IB_IOCTL_MAGIC 0x1b /* See Documentation/ioctl-number.txt */ #define HFI1_SNOOP_IOC_MAGIC IB_IOCTL_MAGIC -#define HFI1_SNOOP_IOC_BASE_SEQ 0x80 #define HFI1_SNOOP_IOCGETLINKSTATE \ _IO(HFI1_SNOOP_IOC_MAGIC, HFI1_SNOOP_IOC_BASE_SEQ) @@ -1040,21 +1038,16 @@ static long hfi1_ioctl(struct file *fp, unsigned int cmd, unsigned long arg) struct hfi1_pportdata *ppd = NULL; unsigned int index; struct hfi1_link_info link_info; - int read_cmd, write_cmd, read_ok, write_ok; dd = hfi1_dd_from_sc_inode(fp->f_inode); if (!dd) return -ENODEV; - mode_flag = dd->hfi1_snoop.mode_flag; - read_cmd = _IOC_DIR(cmd) & _IOC_READ; - write_cmd = _IOC_DIR(cmd) & _IOC_WRITE; - write_ok = access_ok(VERIFY_WRITE, (void __user *)arg, _IOC_SIZE(cmd)); - read_ok = access_ok(VERIFY_READ, (void __user *)arg, _IOC_SIZE(cmd)); - - if ((read_cmd && !write_ok) || (write_cmd && !read_ok)) + if (check_ioctl_access(cmd, arg)) return -EFAULT; + mode_flag = dd->hfi1_snoop.mode_flag; + if (!capable(CAP_SYS_ADMIN)) return -EPERM; diff --git a/drivers/staging/rdma/hfi1/file_ops.c b/drivers/staging/rdma/hfi1/file_ops.c index c5be124..b4172d4 100644 --- a/drivers/staging/rdma/hfi1/file_ops.c +++ b/drivers/staging/rdma/hfi1/file_ops.c @@ -97,6 +97,8 @@ static int user_event_ack(struct hfi1_ctxtdata *, int, unsigned long); static int set_ctxt_pkey(struct hfi1_ctxtdata *, unsigned, u16); static int manage_rcvq(struct hfi1_ctxtdata *, unsigned, int); static int vma_fault(struct vm_area_struct *, struct vm_fault *); +static long hfi1_file_ioctl(struct file *fp, unsigned int cmd, + unsigned long arg); static const struct file_operations hfi1_file_ops = { .owner = THIS_MODULE, @@ -104,6 +106,7 @@ static const struct file_operations hfi1_file_ops = { .write_iter = hfi1_write_iter, .open = hfi1_file_open, .release = hfi1_file_close, + .unlocked_ioctl = hfi1_file_ioctl, .poll = hfi1_poll, .mmap = hfi1_file_mmap, .llseek = noop_llseek, @@ -176,6 +179,227 @@ static int hfi1_file_open(struct inode *inode, struct file *fp) return fp->private_data ? 0 : -ENOMEM; } +static long hfi1_file_ioctl(struct file *fp, unsigned int cmd, + unsigned long arg) +{ + struct hfi1_filedata *fd = fp->private_data; + struct hfi1_ctxtdata *uctxt = fd->uctxt; + struct hfi1_user_info uinfo; + struct hfi1_tid_info tinfo; + struct hfi1_cmd ucmd; + int uctxt_required = 1; + int ret = 0; + unsigned long addr; + int uval = 0; + unsigned long ul_uval = 0; + u16 uval16 = 0; + + ret = check_ioctl_access(cmd, arg); + if (ret) + return ret; + + switch (cmd) { + case HFI1_IOCTL_ASSIGN_CTXT: + uctxt_required = 0; /* assigned user context not required */ + break; + case HFI1_IOCTL_EP_INFO: + case HFI1_IOCTL_EP_ERASE_CHIP: + case HFI1_IOCTL_EP_ERASE_RANGE: + case HFI1_IOCTL_EP_READ_RANGE: + case HFI1_IOCTL_EP_WRITE_RANGE: + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + if (copy_from_user(&ucmd, + (struct hfi11_cmd __user *)arg, + sizeof(ucmd))) + return -EFAULT; + return handle_eprom_command(fp, &ucmd); + } + + if (uctxt_required && !uctxt) + return -EINVAL; + + /* Checked for root/context process the IOCTL */ + switch (cmd) { + case HFI1_IOCTL_ASSIGN_CTXT: + if (copy_from_user(&uinfo, + (struct hfi1_user_info __user *)arg, + sizeof(uinfo))) + return -EFAULT; + + ret = assign_ctxt(fp, &uinfo); + if (ret < 0) + return ret; + setup_ctxt(fp); + if (ret) + return ret; + ret = user_init(fp); + break; + case HFI1_IOCTL_CTXT_INFO: + ret = get_ctxt_info(fp, (void __user *)(unsigned long)arg, + sizeof(struct hfi1_ctxt_info)); + break; + case HFI1_IOCTL_USER_INFO: + ret = get_base_info(fp, (void __user *)(unsigned long)arg, + sizeof(struct hfi1_base_info)); + break; + case HFI1_IOCTL_CREDIT_UPD: + if (uctxt && uctxt->sc) + sc_return_credits(uctxt->sc); + break; + + case HFI1_IOCTL_TID_UPDATE: + if (copy_from_user(&tinfo, + (struct hfi11_tid_info __user *)arg, + sizeof(tinfo))) + return -EFAULT; + + ret = hfi1_user_exp_rcv_setup(fp, &tinfo); + if (!ret) { + /* + * Copy the number of tidlist entries we used + * and the length of the buffer we registered. + * These fields are adjacent in the structure so + * we can copy them at the same time. + */ + addr = arg + offsetof(struct hfi1_tid_info, tidcnt); + if (copy_to_user((void __user *)addr, &tinfo.tidcnt, + sizeof(tinfo.tidcnt) + + sizeof(tinfo.length))) + ret = -EFAULT; + } + break; + + case HFI1_IOCTL_TID_FREE: + if (copy_from_user(&tinfo, + (struct hfi11_tid_info __user *)arg, + sizeof(tinfo))) + return -EFAULT; + + ret = hfi1_user_exp_rcv_clear(fp, &tinfo); + if (ret) + break; + addr = arg + offsetof(struct hfi1_tid_info, tidcnt); + if (copy_to_user((void __user *)addr, &tinfo.tidcnt, + sizeof(tinfo.tidcnt))) + ret = -EFAULT; + break; + + case HFI1_IOCTL_TID_INVAL_READ: + if (copy_from_user(&tinfo, + (struct hfi11_tid_info __user *)arg, + sizeof(tinfo))) + return -EFAULT; + + ret = hfi1_user_exp_rcv_invalid(fp, &tinfo); + if (ret) + break; + addr = arg + offsetof(struct hfi1_tid_info, tidcnt); + if (copy_to_user((void __user *)addr, &tinfo.tidcnt, + sizeof(tinfo.tidcnt))) + ret = -EFAULT; + break; + + case HFI1_IOCTL_RECV_CTRL: + ret = __get_user(uval, (int __user *)arg); + if (ret != 0) + return -EFAULT; + ret = manage_rcvq(uctxt, fd->subctxt, uval); + break; + + case HFI1_IOCTL_POLL_TYPE: + ret = __get_user(uval, (int __user *)arg); + if (ret != 0) + return -EFAULT; + uctxt->poll_type = (typeof(uctxt->poll_type))uval; + break; + + case HFI1_IOCTL_ACK_EVENT: + ret = __get_user(ul_uval, (unsigned long __user *)arg); + if (ret != 0) + return -EFAULT; + ret = user_event_ack(uctxt, fd->subctxt, ul_uval); + break; + + case HFI1_IOCTL_SET_PKEY: + ret = __get_user(uval16, (u16 __user *)arg); + if (ret != 0) + return -EFAULT; + if (HFI1_CAP_IS_USET(PKEY_CHECK)) + ret = set_ctxt_pkey(uctxt, fd->subctxt, uval16); + else + ret = -EPERM; + break; + + case HFI1_IOCTL_CTXT_RESET: { + struct send_context *sc; + struct hfi1_devdata *dd; + + if (!uctxt || !uctxt->dd || !uctxt->sc) { + ret = -EINVAL; + break; + } + /* + * There is no protection here. User level has to + * guarantee that no one will be writing to the send + * context while it is being re-initialized. + * If user level breaks that guarantee, it will break + * it's own context and no one else's. + */ + dd = uctxt->dd; + sc = uctxt->sc; + /* + * Wait until the interrupt handler has marked the + * context as halted or frozen. Report error if we time + * out. + */ + wait_event_interruptible_timeout( + sc->halt_wait, (sc->flags & SCF_HALTED), + msecs_to_jiffies(SEND_CTXT_HALT_TIMEOUT)); + if (!(sc->flags & SCF_HALTED)) { + ret = -ENOLCK; + break; + } + /* + * If the send context was halted due to a Freeze, + * wait until the device has been "unfrozen" before + * resetting the context. + */ + if (sc->flags & SCF_FROZEN) { + wait_event_interruptible_timeout( + dd->event_queue, + !(ACCESS_ONCE(dd->flags) & HFI1_FROZEN), + msecs_to_jiffies(SEND_CTXT_HALT_TIMEOUT)); + if (dd->flags & HFI1_FROZEN) { + ret = -ENOLCK; + break; + } + if (dd->flags & HFI1_FORCED_FREEZE) { + /* + * Don't allow context reset if we are into + * forced freeze + */ + ret = -ENODEV; + break; + } + sc_disable(sc); + ret = sc_enable(sc); + hfi1_rcvctrl(dd, HFI1_RCVCTRL_CTXT_ENB, + uctxt->ctxt); + } else { + ret = sc_restart(sc); + } + if (!ret) + sc_return_credits(sc); + break; + } + default: + return -EINVAL; + } + + return ret; +} + static ssize_t hfi1_file_write(struct file *fp, const char __user *data, size_t count, loff_t *offset) { diff --git a/drivers/staging/rdma/hfi1/hfi.h b/drivers/staging/rdma/hfi1/hfi.h index 7b78d56..7351898 100644 --- a/drivers/staging/rdma/hfi1/hfi.h +++ b/drivers/staging/rdma/hfi1/hfi.h @@ -1946,4 +1946,18 @@ static inline u32 qsfp_resource(struct hfi1_devdata *dd) int hfi1_tempsense_rd(struct hfi1_devdata *dd, struct hfi1_temp *temp); +static inline int check_ioctl_access(unsigned int cmd, unsigned long arg) +{ + int read_cmd, write_cmd, read_ok, write_ok; + + read_cmd = _IOC_DIR(cmd) & _IOC_READ; + write_cmd = _IOC_DIR(cmd) & _IOC_WRITE; + write_ok = access_ok(VERIFY_WRITE, (void __user *)arg, _IOC_SIZE(cmd)); + read_ok = access_ok(VERIFY_READ, (void __user *)arg, _IOC_SIZE(cmd)); + + if ((read_cmd && !write_ok) || (write_cmd && !read_ok)) + return -EFAULT; + + return 0; +} #endif /* _HFI1_KERNEL_H */ diff --git a/include/uapi/rdma/hfi/hfi1_user.h b/include/uapi/rdma/hfi/hfi1_user.h index 46f6caa..37a9697 100644 --- a/include/uapi/rdma/hfi/hfi1_user.h +++ b/include/uapi/rdma/hfi/hfi1_user.h @@ -78,6 +78,11 @@ #define HFI1_USER_SWMINOR 0 /* + * We will encode the major/minor inside a single 32bit version number. + */ +#define HFI1_SWMAJOR_SHIFT 16 + +/* * Set of HW and driver capability/feature bits. * These bit values are used to configure enabled/disabled HW and * driver features. The same set of bits are communicated to user @@ -142,6 +147,48 @@ #define HFI1_CMD_EP_READ_RANGE 76 /* read EPROM range */ #define HFI1_CMD_EP_WRITE_RANGE 77 /* write EPROM range */ +/* + * User IOCTLs can not go above 128 if they do then see common.h and change the + * base for the snoop ioctl + */ +#define IB_IOCTL_MAGIC 0x1b /* See Documentation/ioctl/ioctl-number.txt */ + +struct hfi1_cmd; +#define HFI1_IOCTL_ASSIGN_CTXT \ + _IOWR(IB_IOCTL_MAGIC, HFI1_CMD_ASSIGN_CTXT, struct hfi1_user_info) +#define HFI1_IOCTL_CTXT_INFO \ + _IOW(IB_IOCTL_MAGIC, HFI1_CMD_CTXT_INFO, struct hfi1_ctxt_info) +#define HFI1_IOCTL_USER_INFO \ + _IOW(IB_IOCTL_MAGIC, HFI1_CMD_USER_INFO, struct hfi1_base_info) +#define HFI1_IOCTL_TID_UPDATE \ + _IOWR(IB_IOCTL_MAGIC, HFI1_CMD_TID_UPDATE, struct hfi1_tid_info) +#define HFI1_IOCTL_TID_FREE \ + _IOWR(IB_IOCTL_MAGIC, HFI1_CMD_TID_FREE, struct hfi1_tid_info) +#define HFI1_IOCTL_CREDIT_UPD \ + _IO(IB_IOCTL_MAGIC, HFI1_CMD_CREDIT_UPD) +#define HFI1_IOCTL_RECV_CTRL \ + _IOW(IB_IOCTL_MAGIC, HFI1_CMD_RECV_CTRL, int) +#define HFI1_IOCTL_POLL_TYPE \ + _IOW(IB_IOCTL_MAGIC, HFI1_CMD_POLL_TYPE, int) +#define HFI1_IOCTL_ACK_EVENT \ + _IOW(IB_IOCTL_MAGIC, HFI1_CMD_ACK_EVENT, unsigned long) +#define HFI1_IOCTL_SET_PKEY \ + _IOW(IB_IOCTL_MAGIC, HFI1_CMD_SET_PKEY, __u16) +#define HFI1_IOCTL_CTXT_RESET \ + _IO(IB_IOCTL_MAGIC, HFI1_CMD_CTXT_RESET) +#define HFI1_IOCTL_TID_INVAL_READ \ + _IOWR(IB_IOCTL_MAGIC, HFI1_CMD_TID_INVAL_READ, struct hfi1_tid_info) +#define HFI1_IOCTL_EP_INFO \ + _IOWR(IB_IOCTL_MAGIC, HFI1_CMD_EP_INFO, struct hfi1_cmd) +#define HFI1_IOCTL_EP_ERASE_CHIP \ + _IOWR(IB_IOCTL_MAGIC, HFI1_CMD_EP_ERASE_CHIP, struct hfi1_cmd) +#define HFI1_IOCTL_EP_ERASE_RANGE \ + _IOWR(IB_IOCTL_MAGIC, HFI1_CMD_EP_ERASE_RANGE, struct hfi1_cmd) +#define HFI1_IOCTL_EP_READ_RANGE \ + _IOWR(IB_IOCTL_MAGIC, HFI1_CMD_EP_READ_RANGE, struct hfi1_cmd) +#define HFI1_IOCTL_EP_WRITE_RANGE \ + _IOWR(IB_IOCTL_MAGIC, HFI1_CMD_EP_WRITE_RANGE, struct hfi1_cmd) + #define _HFI1_EVENT_FROZEN_BIT 0 #define _HFI1_EVENT_LINKDOWN_BIT 1 #define _HFI1_EVENT_LID_CHANGE_BIT 2 @@ -242,6 +289,7 @@ struct hfi1_tid_info { __u32 length; }; +/* hfi1_cmd is used for EPROM commands only */ struct hfi1_cmd { __u32 type; /* command type */ __u32 len; /* length of struct pointed to by add */