Message ID | 20190522164119.24139-4-emil.l.velikov@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/5] vmwgfx: drop empty lastclose stub | expand |
Hi, Emil, On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote: > From: Emil Velikov <emil.velikov@collabora.com> > > Drop the custom ioctl io encoding check - core drm does it for us. I fail to see where the core does this, or do I miss something? Thanks, Thomas > > Cc: "VMware Graphics" <linux-graphics-maintainer@vmware.com> > Cc: Thomas Hellstrom <thellstrom@vmware.com> > Cc: Daniel Vetter <daniel@ffwll.ch> > Signed-off-by: Emil Velikov <emil.velikov@collabora.com> > --- > drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 9 --------- > 1 file changed, 9 deletions(-) > > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c > b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c > index 2cb6ae219e43..f65542639b55 100644 > --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c > @@ -1147,9 +1147,6 @@ static long vmw_generic_ioctl(struct file > *filp, unsigned int cmd, > return -EACCES; > } > > - if (unlikely(ioctl->cmd != cmd)) > - goto out_io_encoding; > - > flags = ioctl->flags; > } else if (!drm_ioctl_flags(nr, &flags)) > return -EINVAL; > @@ -1169,12 +1166,6 @@ static long vmw_generic_ioctl(struct file > *filp, unsigned int cmd, > ttm_read_unlock(&vmaster->lock); > > return ret; > - > -out_io_encoding: > - DRM_ERROR("Invalid command format, ioctl %d\n", > - nr - DRM_COMMAND_BASE); > - > - return -EINVAL; > } > > static long vmw_unlocked_ioctl(struct file *filp, unsigned int cmd,
On 2019/05/23, Thomas Hellstrom wrote: > Hi, Emil, > > On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote: > > From: Emil Velikov <emil.velikov@collabora.com> > > > > Drop the custom ioctl io encoding check - core drm does it for us. > > I fail to see where the core does this, or do I miss something? drm_ioctl() allows for the encoding to be changed and attributes that only the appropriate size is copied in/out of the kernel. Technically the function is more relaxed relative to the vmwgfx check, yet seems perfectly reasonable. Is there any corner-case that isn't but should be handled in drm_ioctl()? -Emil
On Fri, 2019-05-24 at 13:14 +0100, Emil Velikov wrote: > On 2019/05/23, Thomas Hellstrom wrote: > > Hi, Emil, > > > > On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote: > > > From: Emil Velikov <emil.velikov@collabora.com> > > > > > > Drop the custom ioctl io encoding check - core drm does it for > > > us. > > > > I fail to see where the core does this, or do I miss something? > > drm_ioctl() allows for the encoding to be changed and attributes that > only the > appropriate size is copied in/out of the kernel. > > Technically the function is more relaxed relative to the vmwgfx > check, yet > seems perfectly reasonable. > > Is there any corner-case that isn't but should be handled in > drm_ioctl()? I'd like to turn the question around and ask whether there's a reason we should relax the vmwgfx test? In the past it has trapped quite a few user-space errors. Thanks, Thomas > > -Emil
On 2019/05/24, Thomas Hellstrom wrote: > On Fri, 2019-05-24 at 13:14 +0100, Emil Velikov wrote: > > On 2019/05/23, Thomas Hellstrom wrote: > > > Hi, Emil, > > > > > > On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote: > > > > From: Emil Velikov <emil.velikov@collabora.com> > > > > > > > > Drop the custom ioctl io encoding check - core drm does it for > > > > us. > > > > > > I fail to see where the core does this, or do I miss something? > > > > drm_ioctl() allows for the encoding to be changed and attributes that > > only the > > appropriate size is copied in/out of the kernel. > > > > Technically the function is more relaxed relative to the vmwgfx > > check, yet > > seems perfectly reasonable. > > > > Is there any corner-case that isn't but should be handled in > > drm_ioctl()? > > I'd like to turn the question around and ask whether there's a reason > we should relax the vmwgfx test? In the past it has trapped quite a few > user-space errors. > The way I see it either: - the check, as-is, is unnessesary, or - it is needed, and we should do something equivalent for all of DRM We had a very long brainstorming session with a colleague and we could not see any cases where this would cause a problem. If you recall anything concrete please let me know - I would be more than happy to take a closer look. Thanks Emil
Hi, Emil On Fri, 2019-05-24 at 16:26 +0100, Emil Velikov wrote: > On 2019/05/24, Thomas Hellstrom wrote: > > On Fri, 2019-05-24 at 13:14 +0100, Emil Velikov wrote: > > > On 2019/05/23, Thomas Hellstrom wrote: > > > > Hi, Emil, > > > > > > > > On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote: > > > > > From: Emil Velikov <emil.velikov@collabora.com> > > > > > > > > > > Drop the custom ioctl io encoding check - core drm does it > > > > > for > > > > > us. > > > > > > > > I fail to see where the core does this, or do I miss something? > > > > > > drm_ioctl() allows for the encoding to be changed and attributes > > > that > > > only the > > > appropriate size is copied in/out of the kernel. > > > > > > Technically the function is more relaxed relative to the vmwgfx > > > check, yet > > > seems perfectly reasonable. > > > > > > Is there any corner-case that isn't but should be handled in > > > drm_ioctl()? > > > > I'd like to turn the question around and ask whether there's a > > reason > > we should relax the vmwgfx test? In the past it has trapped quite a > > few > > user-space errors. > > > The way I see it either: > - the check, as-is, is unnessesary, or > - it is needed, and we should do something equivalent for all of DRM > > We had a very long brainstorming session with a colleague and we > could not see > any cases where this would cause a problem. If you recall anything > concrete > please let me know - I would be more than happy to take a closer > look. If you have a good reason to drop an ioctl sanity check, I'd be perfectly happy to do it. To me, a good reason even includes "I have a non-open-source customer having problems with this check" because of reason etc. etc. as long as I have a way to evaluate those reasons and determine if they're valid or not. "No other drm driver nor the core is doing this" is NOT a valid reason to me. In particular if the check is not affecting performance. So unless you provide additional reasons to drop this check, it's a solid NAK from my side. Thanks, Thomas > > Thanks > Emil
On Sat, 2019-05-25 at 00:39 +0200, Thomas Hellström wrote: > Hi, Emil > > On Fri, 2019-05-24 at 16:26 +0100, Emil Velikov wrote: > > On 2019/05/24, Thomas Hellstrom wrote: > > > On Fri, 2019-05-24 at 13:14 +0100, Emil Velikov wrote: > > > > On 2019/05/23, Thomas Hellstrom wrote: > > > > > Hi, Emil, > > > > > > > > > > On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote: > > > > > > From: Emil Velikov <emil.velikov@collabora.com> > > > > > > > > > > > > Drop the custom ioctl io encoding check - core drm does it > > > > > > for > > > > > > us. > > > > > > > > > > I fail to see where the core does this, or do I miss > > > > > something? > > > > > > > > drm_ioctl() allows for the encoding to be changed and > > > > attributes > > > > that > > > > only the > > > > appropriate size is copied in/out of the kernel. > > > > > > > > Technically the function is more relaxed relative to the vmwgfx > > > > check, yet > > > > seems perfectly reasonable. > > > > > > > > Is there any corner-case that isn't but should be handled in > > > > drm_ioctl()? > > > > > > I'd like to turn the question around and ask whether there's a > > > reason > > > we should relax the vmwgfx test? In the past it has trapped quite > > > a > > > few > > > user-space errors. > > > > > The way I see it either: > > - the check, as-is, is unnessesary, or > > - it is needed, and we should do something equivalent for all of > > DRM > > > > We had a very long brainstorming session with a colleague and we > > could not see > > any cases where this would cause a problem. If you recall anything > > concrete > > please let me know - I would be more than happy to take a closer > > look. > > If you have a good reason to drop an ioctl sanity check, I'd be > perfectly happy to do it. To me, a good reason even includes "I have > a > non-open-source customer having problems with this check" because of > reason etc. etc. as long as I have a way to evaluate those reasons > and > determine if they're valid or not. "No other drm driver nor the core > is > doing this" is NOT a valid reason to me. In particular if the check > is > not affecting performance. So unless you provide additional reasons > to > drop this check, it's a solid NAK from my side. To clarify my point of view a bit, this check is useful to early catch userspace using incorrect flags and sizes, which otherwise might make it out to distros and after that, introducing a check like this would be impossible, since it might break old user-space. For the same reason it would probably be very difficult to introduce it in core drm. Thanks, Thomas > > Thanks, > Thomas > > > > Thanks > > Emil
On 2019/05/25, Thomas Hellstrom wrote: > On Sat, 2019-05-25 at 00:39 +0200, Thomas Hellström wrote: > > Hi, Emil > > > > On Fri, 2019-05-24 at 16:26 +0100, Emil Velikov wrote: > > > On 2019/05/24, Thomas Hellstrom wrote: > > > > On Fri, 2019-05-24 at 13:14 +0100, Emil Velikov wrote: > > > > > On 2019/05/23, Thomas Hellstrom wrote: > > > > > > Hi, Emil, > > > > > > > > > > > > On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote: > > > > > > > From: Emil Velikov <emil.velikov@collabora.com> > > > > > > > > > > > > > > Drop the custom ioctl io encoding check - core drm does it > > > > > > > for > > > > > > > us. > > > > > > > > > > > > I fail to see where the core does this, or do I miss > > > > > > something? > > > > > > > > > > drm_ioctl() allows for the encoding to be changed and > > > > > attributes > > > > > that > > > > > only the > > > > > appropriate size is copied in/out of the kernel. > > > > > > > > > > Technically the function is more relaxed relative to the vmwgfx > > > > > check, yet > > > > > seems perfectly reasonable. > > > > > > > > > > Is there any corner-case that isn't but should be handled in > > > > > drm_ioctl()? > > > > > > > > I'd like to turn the question around and ask whether there's a > > > > reason > > > > we should relax the vmwgfx test? In the past it has trapped quite > > > > a > > > > few > > > > user-space errors. > > > > > > > The way I see it either: > > > - the check, as-is, is unnessesary, or > > > - it is needed, and we should do something equivalent for all of > > > DRM > > > > > > We had a very long brainstorming session with a colleague and we > > > could not see > > > any cases where this would cause a problem. If you recall anything > > > concrete > > > please let me know - I would be more than happy to take a closer > > > look. > > > > If you have a good reason to drop an ioctl sanity check, I'd be > > perfectly happy to do it. To me, a good reason even includes "I have > > a > > non-open-source customer having problems with this check" because of > > reason etc. etc. as long as I have a way to evaluate those reasons > > and > > determine if they're valid or not. "No other drm driver nor the core > > is > > doing this" is NOT a valid reason to me. In particular if the check > > is > > not affecting performance. So unless you provide additional reasons > > to > > drop this check, it's a solid NAK from my side. > > To clarify my point of view a bit, this check is useful to early catch > userspace using incorrect flags and sizes, which otherwise might make > it out to distros and after that, introducing a check like this would > be impossible, since it might break old user-space. For the same reason > it would probably be very difficult to introduce it in core drm. > I think we might be talking past each other, let's take a step back: - as of previous patch, all of vmwgfx ioctls size is consistently handled by the core - handling of featue flags, as always, is responsibility of the driver ifself - with this patch, ioctl direction is also handled by core. Here core ensures we only copy in/out as much data as the kernel implementation can handle. Let's consider the following real world example - msm and virtio_gpu. An in field of an _IOW ioctl becomes in/out aka _IORW ioctl. - we add a flag to annotate/request the out, as always invalid flags are return -EINVAL - we change the ioctl encoding As currently handled by core DRM, old kernel/new userspace and vice-versa works just fine. Sadly, vmwgfx will error out, while it could be avoided. As said above, I'll gladly adjust core and/or others, if this relaxed approach causes an issue somewhere. A specific use-case, real or hypothetical will be appreciated. All this is part of an "evil" plan of mine, to port cool features from vmwgfx to core and effectively remove the vmw_generic_ioctl() wrapper. Hope the bigger picture is clearer now, if not please let me know. Thanks Emil
Hi, Emil, On Mon, 2019-05-27 at 10:08 +0100, Emil Velikov wrote: > On 2019/05/25, Thomas Hellstrom wrote: > > On Sat, 2019-05-25 at 00:39 +0200, Thomas Hellström wrote: > > > Hi, Emil > > > > > > On Fri, 2019-05-24 at 16:26 +0100, Emil Velikov wrote: > > > > On 2019/05/24, Thomas Hellstrom wrote: > > > > > On Fri, 2019-05-24 at 13:14 +0100, Emil Velikov wrote: > > > > > > On 2019/05/23, Thomas Hellstrom wrote: > > > > > > > Hi, Emil, > > > > > > > > > > > > > > On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote: > > > > > > > > From: Emil Velikov <emil.velikov@collabora.com> > > > > > > > > > > > > > > > > Drop the custom ioctl io encoding check - core drm does > > > > > > > > it > > > > > > > > for > > > > > > > > us. > > > > > > > > > > > > > > I fail to see where the core does this, or do I miss > > > > > > > something? > > > > > > > > > > > > drm_ioctl() allows for the encoding to be changed and > > > > > > attributes > > > > > > that > > > > > > only the > > > > > > appropriate size is copied in/out of the kernel. > > > > > > > > > > > > Technically the function is more relaxed relative to the > > > > > > vmwgfx > > > > > > check, yet > > > > > > seems perfectly reasonable. > > > > > > > > > > > > Is there any corner-case that isn't but should be handled > > > > > > in > > > > > > drm_ioctl()? > > > > > > > > > > I'd like to turn the question around and ask whether there's > > > > > a > > > > > reason > > > > > we should relax the vmwgfx test? In the past it has trapped > > > > > quite > > > > > a > > > > > few > > > > > user-space errors. > > > > > > > > > The way I see it either: > > > > - the check, as-is, is unnessesary, or > > > > - it is needed, and we should do something equivalent for all > > > > of > > > > DRM > > > > > > > > We had a very long brainstorming session with a colleague and > > > > we > > > > could not see > > > > any cases where this would cause a problem. If you recall > > > > anything > > > > concrete > > > > please let me know - I would be more than happy to take a > > > > closer > > > > look. > > > > > > If you have a good reason to drop an ioctl sanity check, I'd be > > > perfectly happy to do it. To me, a good reason even includes "I > > > have > > > a > > > non-open-source customer having problems with this check" because > > > of > > > reason etc. etc. as long as I have a way to evaluate those > > > reasons > > > and > > > determine if they're valid or not. "No other drm driver nor the > > > core > > > is > > > doing this" is NOT a valid reason to me. In particular if the > > > check > > > is > > > not affecting performance. So unless you provide additional > > > reasons > > > to > > > drop this check, it's a solid NAK from my side. > > > > To clarify my point of view a bit, this check is useful to early > > catch > > userspace using incorrect flags and sizes, which otherwise might > > make > > it out to distros and after that, introducing a check like this > > would > > be impossible, since it might break old user-space. For the same > > reason > > it would probably be very difficult to introduce it in core drm. > > > I think we might be talking past each other, let's take a step back: > > - as of previous patch, all of vmwgfx ioctls size is consistently > handled by the core I don't think I follow you here, AFAICT patch 3/5 only affects and relaxes the execbuf checking (and in fact a little more than I would like)? > - handling of featue flags, as always, is responsibility of the > driver > ifself > - with this patch, ioctl direction is also handled by core. > > Here core ensures we only copy in/out as much data as the kernel > implementation can handle. > > > Let's consider the following real world example - msm and virtio_gpu. > > An in field of an _IOW ioctl becomes in/out aka _IORW ioctl. > - we add a flag to annotate/request the out, as always invalid flags > are return -EINVAL > - we change the ioctl encoding > > As currently handled by core DRM, old kernel/new userspace and > vice-versa works just fine. Sadly, vmwgfx will error out, while it > could > be avoided. IMO basically we have a tradeoff between strict checking in this case, and new user-space vs old kernel "hazzle-free" transition in the relaxed case. > > As said above, I'll gladly adjust core and/or others, if this relaxed > approach causes an issue somewhere. A specific use-case, real or > hypothetical will be appreciated. To me there are two important reasons to keep the strict approach. 1) Avoid user-space mistakes early in the development cycle. We can't distinguish between buggy user-space and "new" user-space. This is important because of [a]) below. 2) Catch a lot of fuzzer combinations and error out early instead of forwarding them to the ioctl function where they may cause harm. I think the new user-space vs old kernel can be handled nicely in user- space with feature flags or API versions. That's the way we've handled them up to now? [a] But you probably can't move the stricter approach to core drm, or even to core drm ioctls, since that may break old user-space we might not even know about. Relaxing this now may put the vmwgfx-specific ioctls in the same situation. I'd like to have this check also in core drm, but alas, I think it's impossible. > > > All this is part of an "evil" plan of mine, to port cool features > from > vmwgfx to core and effectively remove the vmw_generic_ioctl() > wrapper. I'm not completely sure I understand what role the vmw_generic_ioctl() wrapper plays in this? I mean, the advantage of a helper approach instead of a middle layer approach is that you are free to use the helpers and amend them if desirable. If everybody is forced to use exactly the same helpers, then we're basically back at the middle layer approach? /Thomas > > > Hope the bigger picture is clearer now, if not please let me know. > > Thanks > Emil
Hi Thomas, On 2019/05/27, Thomas Hellstrom wrote: > > I think we might be talking past each other, let's take a step back: > > > > - as of previous patch, all of vmwgfx ioctls size is consistently > > handled by the core > > I don't think I follow you here, AFAICT patch 3/5 only affects and > relaxes the execbuf checking (and in fact a little more than I would > like)? > Precisely, it makes execbuf ioctl behave like all other ioctls - both vmwgfx and rest of drm. > > - handling of featue flags, as always, is responsibility of the > > driver > > ifself > > - with this patch, ioctl direction is also handled by core. > > > > Here core ensures we only copy in/out as much data as the kernel > > implementation can handle. > > > > > > Let's consider the following real world example - msm and virtio_gpu. > > > > An in field of an _IOW ioctl becomes in/out aka _IORW ioctl. > > - we add a flag to annotate/request the out, as always invalid flags > > are return -EINVAL > > - we change the ioctl encoding > > > > As currently handled by core DRM, old kernel/new userspace and > > vice-versa works just fine. Sadly, vmwgfx will error out, while it > > could > > be avoided. > > IMO basically we have a tradeoff between strict checking in this case, > and new user-space vs old kernel "hazzle-free" transition in the > relaxed case. > Precisely. If I read the code correctly, ATM new userspace will fail against old kernels. Unless userspace writes two versions of the ioctl - with with each encoding. > > > > As said above, I'll gladly adjust core and/or others, if this relaxed > > approach causes an issue somewhere. A specific use-case, real or > > hypothetical will be appreciated. > > To me there are two important reasons to keep the strict approach. > > 1) Avoid user-space mistakes early in the development cycle. We can't > distinguish between buggy user-space and "new" user-space. This is > important because of [a]) below. > Can you provide a concrete example, please? > 2) Catch a lot of fuzzer combinations and error out early instead of > forwarding them to the ioctl function where they may cause harm. > Struggling to see why this is a problem? At some point the fuzzer will get past this first line of defence, so we want to make the rest of the ioctl is robust. > I think the new user-space vs old kernel can be handled nicely in user- > space with feature flags or API versions. That's the way we've handled > them up to now? > How is a feature flag doing to help if the encoding changes from _IOW to _IORW? Thanks Emil
On 5/27/19 2:35 PM, Emil Velikov wrote: > Hi Thomas, > > On 2019/05/27, Thomas Hellstrom wrote: > >>> I think we might be talking past each other, let's take a step back: >>> >>> - as of previous patch, all of vmwgfx ioctls size is consistently >>> handled by the core >> I don't think I follow you here, AFAICT patch 3/5 only affects and >> relaxes the execbuf checking (and in fact a little more than I would >> like)? >> > Precisely, it makes execbuf ioctl behave like all other ioctls - both > vmwgfx and rest of drm. But we're still enforcing a non-relaxed size check for the other vmwgfx private ioctls, right? Which is relaxed, together with the directions, in this commit? (Not that it matters much to the discussion, though). > >>> - handling of featue flags, as always, is responsibility of the >>> driver >>> ifself >>> - with this patch, ioctl direction is also handled by core. >>> >>> Here core ensures we only copy in/out as much data as the kernel >>> implementation can handle. >>> >>> >>> Let's consider the following real world example - msm and virtio_gpu. >>> >>> An in field of an _IOW ioctl becomes in/out aka _IORW ioctl. >>> - we add a flag to annotate/request the out, as always invalid flags >>> are return -EINVAL >>> - we change the ioctl encoding >>> >>> As currently handled by core DRM, old kernel/new userspace and >>> vice-versa works just fine. Sadly, vmwgfx will error out, while it >>> could >>> be avoided. >> IMO basically we have a tradeoff between strict checking in this case, >> and new user-space vs old kernel "hazzle-free" transition in the >> relaxed case. >> > Precisely. If I read the code correctly, ATM new userspace will fail > against old kernels. Unless userspace writes two versions of the ioctl - > with with each encoding. > >>> As said above, I'll gladly adjust core and/or others, if this relaxed >>> approach causes an issue somewhere. A specific use-case, real or >>> hypothetical will be appreciated. >> To me there are two important reasons to keep the strict approach. >> >> 1) Avoid user-space mistakes early in the development cycle. We can't >> distinguish between buggy user-space and "new" user-space. This is >> important because of [a]) below. >> > Can you provide a concrete example, please? OK, let's say you were developing fence wait functionality. Like vmw_fence_obj_wait ioctl. Then suddenly you started to wonder why the wait never timed out as it should. The reason turn out to be that signals were restarting the waits with the original timeout. So you change the ioctl from W to RW and add a kernel-computed time to the argument. Everything is fine, except that you forget to change this in a user-space application somewhere. So now what happens, is that that user-space bug can live on undetected as in 1), and that means you can never go back and implement a stricter check because that would completely break old user-space. The current code will trap (and has historically trapped) code like this. That's mainly why I'm reluctant to give it up, but I guess it can be conditionally compiled in for debug purposes. > >> 2) Catch a lot of fuzzer combinations and error out early instead of >> forwarding them to the ioctl function where they may cause harm. >> > Struggling to see why this is a problem? At some point the fuzzer will > get past this first line of defence, so we want to make the rest of the > ioctl is robust. > > >> I think the new user-space vs old kernel can be handled nicely in user- >> space with feature flags or API versions. That's the way we've handled >> them up to now? >> > How is a feature flag doing to help if the encoding changes from _IOW > to _IORW? Ah, you're referring to old user-space new kernel? Yes, I was probably reading a bit too fast. Sorry about that. So we're basically landing in a tradeoff between trapping problems like above, and hazzle-free ioctl argument definition change. OK, so I'm ok with that as long as there is a way we can compile in strict checking, which will likely has to be as a vmwgfx-specific wrapper. /Thomas > > > Thanks > Emil > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel
On 2019/05/27, Thomas Hellstrom wrote: > On 5/27/19 2:35 PM, Emil Velikov wrote: > > Hi Thomas, > > > > On 2019/05/27, Thomas Hellstrom wrote: > > > > > > I think we might be talking past each other, let's take a step back: > > > > > > > > - as of previous patch, all of vmwgfx ioctls size is consistently > > > > handled by the core > > > I don't think I follow you here, AFAICT patch 3/5 only affects and > > > relaxes the execbuf checking (and in fact a little more than I would > > > like)? > > > > > Precisely, it makes execbuf ioctl behave like all other ioctls - both > > vmwgfx and rest of drm. > > But we're still enforcing a non-relaxed size check for the other vmwgfx > private ioctls, right? Which is relaxed, together with the directions, in > this commit? > Regardless of the patch, all !execbuf vmwgfx ioctls use the related size checking from core drm. > (Not that it matters much to the discussion, though). > Agreed. > > > > > > - handling of featue flags, as always, is responsibility of the > > > > driver > > > > ifself > > > > - with this patch, ioctl direction is also handled by core. > > > > > > > > Here core ensures we only copy in/out as much data as the kernel > > > > implementation can handle. > > > > > > > > > > > > Let's consider the following real world example - msm and virtio_gpu. > > > > > > > > An in field of an _IOW ioctl becomes in/out aka _IORW ioctl. > > > > - we add a flag to annotate/request the out, as always invalid flags > > > > are return -EINVAL > > > > - we change the ioctl encoding > > > > > > > > As currently handled by core DRM, old kernel/new userspace and > > > > vice-versa works just fine. Sadly, vmwgfx will error out, while it > > > > could > > > > be avoided. > > > IMO basically we have a tradeoff between strict checking in this case, > > > and new user-space vs old kernel "hazzle-free" transition in the > > > relaxed case. > > > > > Precisely. If I read the code correctly, ATM new userspace will fail > > against old kernels. Unless userspace writes two versions of the ioctl - > > with with each encoding. > > > > > > As said above, I'll gladly adjust core and/or others, if this relaxed > > > > approach causes an issue somewhere. A specific use-case, real or > > > > hypothetical will be appreciated. > > > To me there are two important reasons to keep the strict approach. > > > > > > 1) Avoid user-space mistakes early in the development cycle. We can't > > > distinguish between buggy user-space and "new" user-space. This is > > > important because of [a]) below. > > > > > Can you provide a concrete example, please? > > OK, let's say you were developing fence wait functionality. Like > vmw_fence_obj_wait ioctl. Then suddenly you started to wonder why the wait > never timed out as it should. The reason turn out to be that signals were > restarting the waits with the original timeout. So you change the ioctl from > W to RW and add a kernel-computed time to the argument. Everything is fine, > except that you forget to change this in a user-space application somewhere. > > So now what happens, is that that user-space bug can live on undetected as > in 1), and that means you can never go back and implement a stricter check > because that would completely break old user-space. > If I understand you correctly, the W -> RW change in unnecessary. Yet the only negative effect that I can see is the copy_to_user() overhead. The copy should be negligible, yet it "feels" silly. Is there anything more serious that I've missed? Having a closer look - vmwgfx (et al) seems to stand out, such that it does not provide a UABI define including the encoding. Hence it sort of duplicates that in userspace, by using the explicit drmCommand* Guess I could follow the suggestion in vmwgfx_drv.c move the defines to UABI, sync header and update mesa/xf86-video-vmwgfx. What do you think - yes, or please don't? > The current code will trap (and has historically trapped) code like this. > That's mainly why I'm reluctant to give it up, but I guess it can be > conditionally compiled in for debug purposes. > This piece here, is the holly grail. I'll go further and suggest: - add a strict encoding and size check, behind a config toggle - make it a core drm thing and drop the custom vmwgfx one Will keep it disabled by default - but will clearly document Kconfig and docs that devs should toggle it to catch bugs. > > > > > 2) Catch a lot of fuzzer combinations and error out early instead of > > > forwarding them to the ioctl function where they may cause harm. > > > > > Struggling to see why this is a problem? At some point the fuzzer will > > get past this first line of defence, so we want to make the rest of the > > ioctl is robust. > > > > > > > I think the new user-space vs old kernel can be handled nicely in user- > > > space with feature flags or API versions. That's the way we've handled > > > them up to now? > > > > > How is a feature flag doing to help if the encoding changes from _IOW > > to _IORW? > > Ah, you're referring to old user-space new kernel? Yes, I was probably > reading a bit too fast. Sorry about that. > > So we're basically landing in a tradeoff between trapping problems like > above, and hazzle-free ioctl argument definition change. > > OK, so I'm ok with that as long as there is a way we can compile in strict > checking, which will likely has to be as a vmwgfx-specific wrapper. > Ack, I'll proceed with the debug toggle suggestion. Thank you for the insightful input. Emil
On 5/27/19 5:27 PM, Emil Velikov wrote: > On 2019/05/27, Thomas Hellstrom wrote: >> On 5/27/19 2:35 PM, Emil Velikov wrote: >>> Hi Thomas, >>> >>> On 2019/05/27, Thomas Hellstrom wrote: >>> >>>>> I think we might be talking past each other, let's take a step back: >>>>> >>>>> - as of previous patch, all of vmwgfx ioctls size is consistently >>>>> handled by the core >>>> I don't think I follow you here, AFAICT patch 3/5 only affects and >>>> relaxes the execbuf checking (and in fact a little more than I would >>>> like)? >>>> >>> Precisely, it makes execbuf ioctl behave like all other ioctls - both >>> vmwgfx and rest of drm. >> But we're still enforcing a non-relaxed size check for the other vmwgfx >> private ioctls, right? Which is relaxed, together with the directions, in >> this commit? >> > Regardless of the patch, all !execbuf vmwgfx ioctls use the related size > checking from core drm. Well it does, but since we (before this patch) enforce ioctl->cmd == cmd, we also enforce _IOC_SIZE(ioctl->cmd) == _IOC_SIZE(cmd), which makes the core check pointless, or am I missing something? > >> (Not that it matters much to the discussion, though). >> > Agreed. > >>> ... >>> Can you provide a concrete example, please? >> OK, let's say you were developing fence wait functionality. Like >> vmw_fence_obj_wait ioctl. Then suddenly you started to wonder why the wait >> never timed out as it should. The reason turn out to be that signals were >> restarting the waits with the original timeout. So you change the ioctl from >> W to RW and add a kernel-computed time to the argument. Everything is fine, >> except that you forget to change this in a user-space application somewhere. >> >> So now what happens, is that that user-space bug can live on undetected as >> in 1), and that means you can never go back and implement a stricter check >> because that would completely break old user-space. >> > If I understand you correctly, the W -> RW change in unnecessary. Yet > the only negative effect that I can see is the copy_to_user() overhead. > > The copy should be negligible, yet it "feels" silly. > > Is there anything more serious that I've missed? Well the point is in this case, that the write was necessary, but the code would work sort of OK anyway. It updated a kernel "cookie" to make sure the timeout would be correct even with the next call repetition. Now if an old header was floating around, there might be clients using it. And with the current core checks that typically wouldn't get noticed. With the check we'd immediately notice and abort. It feels a little like moving from ANSI C to K&R... :-) > > > Having a closer look - vmwgfx (et al) seems to stand out, such that it > does not provide a UABI define including the encoding. Hence it sort of > duplicates that in userspace, by using the explicit drmCommand* > > Guess I could follow the suggestion in vmwgfx_drv.c move the defines to > UABI, sync header and update mesa/xf86-video-vmwgfx. > > What do you think - yes, or please don't? Please hold on for a while, and I'll discuss it internally. > >> The current code will trap (and has historically trapped) code like this. >> That's mainly why I'm reluctant to give it up, but I guess it can be >> conditionally compiled in for debug purposes. >> > This piece here, is the holly grail. I'll go further and suggest: > > - add a strict encoding and size check, behind a config toggle > - make it a core drm thing and drop the custom vmwgfx one > > Will keep it disabled by default - but will clearly document Kconfig and > docs that devs should toggle it to catch bugs. Sounds good, but IIRC the reason why I kept it only to driver-private ioctls, is that there were errors with the drm ioctls. But that was a long time ago so I might remember incorrectly, or user-space has been fixed. > >>>> 2) Catch a lot of fuzzer combinations and error out early instead of >>>> forwarding them to the ioctl function where they may cause harm. >>>> >>> Struggling to see why this is a problem? At some point the fuzzer will >>> get past this first line of defence, so we want to make the rest of the >>> ioctl is robust. >>> >>> >>>> I think the new user-space vs old kernel can be handled nicely in user- >>>> space with feature flags or API versions. That's the way we've handled >>>> them up to now? >>>> >>> How is a feature flag doing to help if the encoding changes from _IOW >>> to _IORW? >> Ah, you're referring to old user-space new kernel? Yes, I was probably >> reading a bit too fast. Sorry about that. >> >> So we're basically landing in a tradeoff between trapping problems like >> above, and hazzle-free ioctl argument definition change. >> >> OK, so I'm ok with that as long as there is a way we can compile in strict >> checking, which will likely has to be as a vmwgfx-specific wrapper. >> > Ack, I'll proceed with the debug toggle suggestion. Great. > > Thank you for the insightful input. > Emil Thanks, Thomas
On 2019/05/27, Thomas Hellstrom wrote: > On 5/27/19 5:27 PM, Emil Velikov wrote: > > On 2019/05/27, Thomas Hellstrom wrote: > > > On 5/27/19 2:35 PM, Emil Velikov wrote: > > > > Hi Thomas, > > > > > > > > On 2019/05/27, Thomas Hellstrom wrote: > > > > > > > > > > I think we might be talking past each other, let's take a step back: > > > > > > > > > > > > - as of previous patch, all of vmwgfx ioctls size is consistently > > > > > > handled by the core > > > > > I don't think I follow you here, AFAICT patch 3/5 only affects and > > > > > relaxes the execbuf checking (and in fact a little more than I would > > > > > like)? > > > > > > > > > Precisely, it makes execbuf ioctl behave like all other ioctls - both > > > > vmwgfx and rest of drm. > > > But we're still enforcing a non-relaxed size check for the other vmwgfx > > > private ioctls, right? Which is relaxed, together with the directions, in > > > this commit? > > > > > Regardless of the patch, all !execbuf vmwgfx ioctls use the related size > > checking from core drm. > > Well it does, but since we (before this patch) enforce ioctl->cmd == cmd, we > also enforce > _IOC_SIZE(ioctl->cmd) == _IOC_SIZE(cmd), which makes the core check > pointless, or am I missing something? > You're spot on - I never looked at the _IOC_SIZE declaration. I was assuming some other magic. > > > > > (Not that it matters much to the discussion, though). > > > > > Agreed. > > > > > > > ... > > > > Can you provide a concrete example, please? > > > OK, let's say you were developing fence wait functionality. Like > > > vmw_fence_obj_wait ioctl. Then suddenly you started to wonder why the wait > > > never timed out as it should. The reason turn out to be that signals were > > > restarting the waits with the original timeout. So you change the ioctl from > > > W to RW and add a kernel-computed time to the argument. Everything is fine, > > > except that you forget to change this in a user-space application somewhere. > > > > > > So now what happens, is that that user-space bug can live on undetected as > > > in 1), and that means you can never go back and implement a stricter check > > > because that would completely break old user-space. > > > > > If I understand you correctly, the W -> RW change in unnecessary. Yet > > the only negative effect that I can see is the copy_to_user() overhead. > > > > The copy should be negligible, yet it "feels" silly. > > > > Is there anything more serious that I've missed? > > Well the point is in this case, that the write was necessary, but the code > would work sort of OK anyway. It updated a kernel "cookie" to make sure the > timeout would be correct even with the next call repetition. Now if an old > header was floating around, there might be clients using it. And with the > current core checks that typically wouldn't get noticed. With the check we'd > immediately notice and abort. It feels a little like moving from ANSI C to > K&R... :-) > Technically, the kernel (or any function really) should write output arguments only when needed. Agree though - it's sometimes annoying or inconvenient. For the ANSI C to K&R comment - sure, only if one cares about backward compat. If they don't - flip the config toggle and carry on ;-) > > > > > > Having a closer look - vmwgfx (et al) seems to stand out, such that it > > does not provide a UABI define including the encoding. Hence it sort of > > duplicates that in userspace, by using the explicit drmCommand* > > > > Guess I could follow the suggestion in vmwgfx_drv.c move the defines to > > UABI, sync header and update mesa/xf86-video-vmwgfx. > > > > What do you think - yes, or please don't? > > Please hold on for a while, and I'll discuss it internally. > Ack. > > > > > The current code will trap (and has historically trapped) code like this. > > > That's mainly why I'm reluctant to give it up, but I guess it can be > > > conditionally compiled in for debug purposes. > > > > > This piece here, is the holly grail. I'll go further and suggest: > > > > - add a strict encoding and size check, behind a config toggle > > - make it a core drm thing and drop the custom vmwgfx one > > > > Will keep it disabled by default - but will clearly document Kconfig and > > docs that devs should toggle it to catch bugs. > > Sounds good, but IIRC the reason why I kept it only to driver-private > ioctls, is that there were errors with the drm ioctls. But that was a long > time ago so I might remember incorrectly, or user-space has been fixed. > Good point - will have a quick look and send patches if needed. Thanks Emil
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c index 2cb6ae219e43..f65542639b55 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c @@ -1147,9 +1147,6 @@ static long vmw_generic_ioctl(struct file *filp, unsigned int cmd, return -EACCES; } - if (unlikely(ioctl->cmd != cmd)) - goto out_io_encoding; - flags = ioctl->flags; } else if (!drm_ioctl_flags(nr, &flags)) return -EINVAL; @@ -1169,12 +1166,6 @@ static long vmw_generic_ioctl(struct file *filp, unsigned int cmd, ttm_read_unlock(&vmaster->lock); return ret; - -out_io_encoding: - DRM_ERROR("Invalid command format, ioctl %d\n", - nr - DRM_COMMAND_BASE); - - return -EINVAL; } static long vmw_unlocked_ioctl(struct file *filp, unsigned int cmd,