Message ID | 167044909523.3885870.619291306425395938.stgit@omen (mailing list archive) |
---|---|
State | New, archived |
Series | vfio/type1: Cleanup remaining vaddr removal/update fragments |

On Wed, Dec 07, 2022 at 02:45:18PM -0700, Alex Williamson wrote:
> Fix several loose ends relative to reverting support for vaddr removal
> and update. Mark feature and ioctl flags as deprecated, restore local
> variable scope in pin pages, remove remaining support in the mapping
> code.
>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason

> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Thursday, December 8, 2022 5:45 AM
>
> Fix several loose ends relative to reverting support for vaddr removal
> and update. Mark feature and ioctl flags as deprecated, restore local
> variable scope in pin pages, remove remaining support in the mapping
> code.
>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
>
> This applies on top of Steve's patch[1] to fully remove and deprecate
> this feature in the short term, following the same methodology we used
> for the v1 migration interface removal. The intention would be to pick
> Steve's patch and this follow-on for v6.2 given that existing support
> exposes vulnerabilities and no known upstream userspaces make use of
> this feature.
>
> [1] https://lore.kernel.org/all/1670363753-249738-2-git-send-email-steven.sistare@oracle.com/
>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

btw given the exposure and no known upstream usage should this be
also pushed to stable kernels?

On Thu, 8 Dec 2022 07:56:30 +0000 "Tian, Kevin" <kevin.tian@intel.com> wrote:
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
>
> btw given the exposure and no known upstream usage should this be
> also pushed to stable kernels?

I'll add to both:

Cc: stable@vger.kernel.org # v5.12+

Thanks,
Alex

On 12/8/2022 11:40 AM, Alex Williamson wrote:
> On Thu, 8 Dec 2022 07:56:30 +0000
> "Tian, Kevin" <kevin.tian@intel.com> wrote:
>> btw given the exposure and no known upstream usage should this be
>> also pushed to stable kernels?
>
> I'll add to both:
>
> Cc: stable@vger.kernel.org # v5.12+

We maintain and use a version of qemu that contains the live update patches,
and requires these kernel interfaces. Other companies are also experimenting
with it. Please do not remove it from stable.

- Steve

On Fri, 9 Dec 2022 13:40:29 -0500 Steven Sistare <steven.sistare@oracle.com> wrote:
> We maintain and use a version of qemu that contains the live update patches,
> and requires these kernel interfaces. Other companies are also experimenting
> with it. Please do not remove it from stable.

The interface has been determined to have vulnerabilities and the
proposal to resolve those vulnerabilities is to implement a new API.
If we think it's worthwhile to remove the existing, vulnerable interface
in the current kernel, what makes it safe to keep it for stable kernels?

Existing users that could choose not to accept the revert in their
downstream kernel and allowing users evaluating the interface more time
before they know it's been removed upstream, are not terribly
compelling reasons to keep it in upstream stable kernels. Thanks,

Alex

On 12/9/2022 2:42 PM, Alex Williamson wrote:
> The interface has been determined to have vulnerabilities and the
> proposal to resolve those vulnerabilities is to implement a new API.
> If we think it's worthwhile to remove the existing, vulnerable interface
> in the current kernel, what makes it safe to keep it for stable kernels?

I do not think it's worthwhile, but I have stopped fighting for 6.2.

> Existing users that could choose not to accept the revert in their
> downstream kernel and allowing users evaluating the interface more time
> before they know it's been removed upstream, are not terribly
> compelling reasons to keep it in upstream stable kernels. Thanks,

The compelling reason is that stable is supposed to be stable and maintain
existing interfaces, and now I will need to re-merge the interfaces at
regular intervals when we update UEK from stable. Oracle is a current user
of these interfaces in our business. Do we count?

- Steve

On Fri, 9 Dec 2022 14:52:49 -0500 Steven Sistare <steven.sistare@oracle.com> wrote:
> The compelling reason is that stable is supposed to be stable and maintain
> existing interfaces, and now I will need to re-merge the interfaces at
> regular intervals when we update UEK from stable. Oracle is a current user
> of these interfaces in our business. Do we count?

These are the rules for stable from
Documentation/process/stable-kernel-rules.rst:

 - It must be obviously correct and tested.

(check)

 - It cannot be bigger than 100 lines, with context.

(We're pushing this a bit, but we could certainly disable w/o removing
the interface in far fewer lines. We're close enough that I think a
direct backport is preferable)

 - It must fix only one thing.

(check)

 - It must fix a real bug that bothers people (not a, "This could be a
   problem..." type thing).

(This is a point where you could present an objection)

 - It must fix a problem that causes a build error (but not for things
   marked CONFIG_BROKEN), an oops, a hang, data corruption, a real
   security issue, or some "oh, that's not good" issue. In short, something
   critical.

(This as well)

 - Serious issues as reported by a user of a distribution kernel may also
   be considered if they fix a notable performance or interactivity issue.
   As these fixes are not as obvious and have a higher risk of a subtle
   regression they should only be submitted by a distribution kernel
   maintainer and include an addendum linking to a bugzilla entry if it
   exists and additional information on the user-visible impact.

(N/A, but note the mention of a user visible impact)

 - New device IDs and quirks are also accepted.

(N/A)

 - No "theoretical race condition" issues, unless an explanation of how the
   race can be exploited is also provided.

(AIUI, the vulnerabilities here may not have exploits, but they are real)

 - It cannot contain any "trivial" fixes in it (spelling changes,
   whitespace cleanups, etc).

(N/A)

 - It must follow the
   :ref:`Documentation/process/submitting-patches.rst <submittingpatches>`
   rules.

(Of course)

 - It or an equivalent fix must already exist in Linus' tree (upstream).

This last bullet is really the crux of what brings us to this point, if
you're not willing to defend the vulnerabilities to maintain the
interface in the mainline kernel, why should the upstream community
maintain them in the stable kernels?

The question is not about who is using the interface, it's the fact that
the resolution to the existing vulnerabilities is to remove the
interface and nobody is making a case around the validity or
exploit-ability of those vulnerabilities to carry along the interface
in the interim.

If the revert does go into mainline, but were to skip stable, that only
delays your re-merging burden briefly, while continuing to expose stable
kernels to the vulnerabilities, and risks further users adopting an
interface that no longer exists. Thanks,

Alex

On 12/9/2022 4:01 PM, Alex Williamson wrote:
> The question is not about who is using the interface, it's the fact that
> the resolution to the existing vulnerabilities is to remove the
> interface and nobody is making a case around the validity or
> exploit-ability of those vulnerabilities to carry along the interface
> in the interim.
>
> If the revert does go into mainline, but were to skip stable, that only
> delays your re-merging burden briefly, while continuing to expose stable
> kernels to the vulnerabilities, and risks further users adopting an
> interface that no longer exists. Thanks,

Thank you for your thoughtful response. Rather than debate the degree
of vulnerability, I propose an alternate solution. The technical crux of
the matter is support for mediated devices. So, let's exclude them when
these legacy interfaces are used, and allow them for native iommufd.

The fix is small and simple: if there is no iommu capable domain in the
container, then return false for the VFIO_UPDATE_VADDR extension. And to
prevent locked_mm underflow, add to the new mm's locked_vm in
VFIO_DMA_MAP_FLAG_VADDR. Two small patches, which I can submit on Monday,
for 6.x and stable. I can also submit a patch for iommufd to use these
interfaces with no mdev in vfio compat mode.

I am still committed to new interfaces for native iommufd, and am making
good progress with Jason's patch.

- Steve
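
For readers following along, the two fixes proposed above can be pictured with a toy Python model. This is only a sketch of the logic as described in the message, not kernel code and not the patches themselves: the `Domain`, `Container`, and `Mm` classes, their fields, and the function names are hypothetical stand-ins invented for illustration. Only the ideas (advertise VFIO_UPDATE_VADDR only when an IOMMU-capable domain exists, and charge the new mm's locked_vm when a vaddr is updated) come from the thread.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Domain:
    # Hypothetical flag: True for a domain backed by IOMMU hardware,
    # False for the software-only case used by mediated devices.
    iommu_capable: bool


@dataclass
class Container:
    # Hypothetical stand-in for the set of domains in a vfio container.
    domains: List[Domain] = field(default_factory=list)


@dataclass
class Mm:
    # Hypothetical stand-in for an address space's locked page count.
    locked_vm: int = 0


def supports_update_vaddr(container: Container) -> bool:
    """Report the extension only if some domain is IOMMU capable."""
    return any(d.iommu_capable for d in container.domains)


def update_vaddr(new_mm: Mm, npages: int) -> None:
    """Charge the already-pinned pages to the mm that now owns the mapping."""
    new_mm.locked_vm += npages


def unmap(mm: Mm, npages: int) -> None:
    """Release the charge; refuse to go below zero instead of wrapping."""
    if mm.locked_vm < npages:
        raise ValueError("locked_vm underflow")
    mm.locked_vm -= npages


if __name__ == "__main__":
    mdev_only = Container([Domain(iommu_capable=False)])
    mixed = Container([Domain(False), Domain(True)])
    print(supports_update_vaddr(mdev_only), supports_update_vaddr(mixed))

    new_mm = Mm()            # fresh address space, nothing accounted yet
    update_vaddr(new_mm, 8)  # the proposed fix: account on vaddr update
    unmap(new_mm, 8)         # a later unmap now balances to zero
    print(new_mm.locked_vm)
```

In this model, skipping `update_vaddr` before `unmap` on a fresh `Mm` raises, which is the toy equivalent of the underflow the second patch is meant to prevent.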

On Sat, Dec 10, 2022 at 09:14:06AM -0500, Steven Sistare wrote:
> Thank you for your thoughtful response. Rather than debate the degree
> of vulnerability, I propose an alternate solution. The technical crux of
> the matter is support for mediated devices.

I'm not sure I'm convinced about that. It is easy to make problematic
situations with mdevs, but that doesn't mean other cases don't exist
too, e.g. what happens if userspace suspends and then immediately does
something to trigger a domain attachment? Doesn't it still deadlock
the kernel?

Honestly, I'm not sure I see the big deal, just don't backport these
reverts to your distro kernel.

Jason

On 12/12/2022 8:17 AM, Jason Gunthorpe wrote:
> I'm not sure I'm convinced about that. It is easy to make problematic
> situations with mdevs, but that doesn't mean other cases don't exist
> too, e.g. what happens if userspace suspends and then immediately does
> something to trigger a domain attachment? Doesn't it still deadlock
> the kernel?

No deadlock. Any ioctls that need vaddr return EINVAL if the vaddr has
been suspended. ioctls that do not need it succeed. The vaddr is not
needed when all pages have been pinned, because iova can be translated
via the iommu.

> Honestly, I'm not sure I see the big deal, just don't backport these
> reverts to your distro kernel.

It must be done every time the kernel is refreshed, and is disruptive and
error prone. All exceptions to the normal process are. And it derails my
qemu patch review until native iommufd support is pushed into qemu. If I
can avoid those problems with a few simple fixes, then that is a win.

- Steve

On Mon, 12 Dec 2022 09:17:54 -0400 Jason Gunthorpe <jgg@ziepe.ca> wrote:
> On Sat, Dec 10, 2022 at 09:14:06AM -0500, Steven Sistare wrote:
>
> > Thank you for your thoughtful response. Rather than debate the degree
> > of vulnerability, I propose an alternate solution. The technical crux of
> > the matter is support for mediated devices.
>
> I'm not sure I'm convinced about that. It is easy to make problematic
> situations with mdevs, but that doesn't mean other cases don't exist
> too, e.g. what happens if userspace suspends and then immediately does
> something to trigger a domain attachment? Doesn't it still deadlock
> the kernel?

The opportunity for that to deadlock isn't obvious to me, a replay
would be stalled waiting for invalid vaddrs, but this is essentially
the user deadlocking themselves. There's also code there to handle the
process getting killed while waiting, making it interruptible. Thanks,

Alex

On 12/12/2022 10:58 AM, Alex Williamson wrote:
> The opportunity for that to deadlock isn't obvious to me, a replay
> would be stalled waiting for invalid vaddrs, but this is essentially
> the user deadlocking themselves. There's also code there to handle the
> process getting killed while waiting, making it interruptible. Thanks,

I will submit new patches tomorrow to exclude mdevs. Almost done.

- Steve

On Mon, 12 Dec 2022 15:59:11 -0500 Steven Sistare <steven.sistare@oracle.com> wrote:
> I will submit new patches tomorrow to exclude mdevs. Almost done.

I've dropped the removal commits from my next branch in the interim.
Thanks,

Alex

On Mon, Dec 12, 2022 at 02:26:51PM -0700, Alex Williamson wrote:
> On Mon, 12 Dec 2022 15:59:11 -0500
> Steven Sistare <steven.sistare@oracle.com> wrote:
>
> > I will submit new patches tomorrow to exclude mdevs. Almost done.
>
> I've dropped the removal commits from my next branch in the interim.

Woah, please don't do that - I already built and sent pull requests
assuming this, there are conflicts.

Why would we not revert everything from 6.2 - that is what we agreed
to do?

Jason

On Mon, 12 Dec 2022 19:08:57 -0400 Jason Gunthorpe <jgg@ziepe.ca> wrote:
> On Mon, Dec 12, 2022 at 02:26:51PM -0700, Alex Williamson wrote:
> > I've dropped the removal commits from my next branch in the interim.
>
> Woah, please don't do that - I already built and sent pull requests
> assuming this, there are conflicts.

I've done merges both ways with your iommufd pull request and don't see
any conflicts relative to these changes. Kconfig, Makefile, and
vfio_main.c related to virq integration and group extraction are the
only conflicts.

Besides, it's already pushed and I don't have any references to the
old head, so someone would need to provide it if we wanted to keep
the old hashes.

> Why would we not revert everything from 6.2 - that is what we agreed
> to do?

The decision to revert was based on the current interface being buggy,
abandoned, and re-implemented. It doesn't seem that there's much future
for the current interface, but Steve has stepped up to restrict the
current implementation to non-mdev devices, which resolves your concern
regarding unlimited user blocking of kernel threads afaict, and we'll
see what he does with locked memory. If it looks ok, then I think it
reduces our urgency to remove it, and in particular, I think negates
our need to remove it from stable when we eventually do so anyway.
Thanks,

Alex

On Mon, Dec 12, 2022 at 04:29:48PM -0700, Alex Williamson wrote:
> I've done merges both ways with your iommufd pull request and don't see
> any conflicts relative to these changes. Kconfig, Makefile, and
> vfio_main.c related to virq integration and group extraction are the
> only conflicts.

I got an extra hunk in the header file

> Besides, it's already pushed and I don't have any references to the
> old head, so someone would need to provide it if we wanted to keep
> the old hashes.

https://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git/tag/?h=for-linus-iommufd-merged
https://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git/commit/?h=for-linus-iommufd-merged&id=e9a1f0f32d86c05f01878a0448384a46a453abc7

> > Why would we not revert everything from 6.2 - that is what we agreed
> > to do?
>
> The decision to revert was based on the current interface being buggy,
> abandoned, and re-implemented. It doesn't seem that there's much future
> for the current interface, but Steve has stepped up to restrict the
> current implementation to non-mdev devices, which resolves your concern
> regarding unlimited user blocking of kernel threads afaict, and we'll
> see what he does with locked memory.

Except nobody has seen this yet, and it can't go into 6.2 at this
point (see Linus's rather harsh remarks on late work for v6.2)

So we are punting on this for another kernel cycle.

I don't like any of this.

Regardless of what happens I need this all settled and the vfio tree
unchanging by Thursday to finalize the above tag..

Jason
On Mon, 12 Dec 2022 19:35:40 -0400
Jason Gunthorpe <jgg@ziepe.ca> wrote:

> On Mon, Dec 12, 2022 at 04:29:48PM -0700, Alex Williamson wrote:
> > On Mon, 12 Dec 2022 19:08:57 -0400
> > Jason Gunthorpe <jgg@ziepe.ca> wrote:
> >
> > > On Mon, Dec 12, 2022 at 02:26:51PM -0700, Alex Williamson wrote:
> > > > On Mon, 12 Dec 2022 15:59:11 -0500
> > > > Steven Sistare <steven.sistare@oracle.com> wrote:
> > > >
> > > > > On 12/12/2022 10:58 AM, Alex Williamson wrote:
> > > > > > On Mon, 12 Dec 2022 09:17:54 -0400
> > > > > > Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > > > > >
> > > > > >> On Sat, Dec 10, 2022 at 09:14:06AM -0500, Steven Sistare wrote:
> > > > > >>
> > > > > >>> Thank you for your thoughtful response. Rather than debate the degree of
> > > > > >>> of vulnerability, I propose an alternate solution. The technical crux of
> > > > > >>> the matter is support for mediated devices.
> > > > > >>
> > > > > >> I'm not sure I'm convinced about that. It is easy to make problematic
> > > > > >> situations with mdevs, but that doesn't mean other cases don't exist
> > > > > >> too eg what happens if userspace suspends and then immediately does
> > > > > >> something to trigger a domain attachment? Doesn't it still deadlock
> > > > > >> the kernel?
> > > > > >
> > > > > > The opportunity for that to deadlock isn't obvious to me, a replay
> > > > > > would be stalled waiting for invalid vaddrs, but this is essentially
> > > > > > the user deadlocking themselves. There's also code there to handle the
> > > > > > process getting killed while waiting, making it interruptible. Thanks,
> > > > >
> > > > > I will submit new patches tomorrow to exclude mdevs. Almost done.
> > > >
> > > > I've dropped the removal commits from my next branch in the interim.
> > >
> > > Woah, please don't do that - I already built and sent pull requests
> > > assuming this, there are conflicts.
> >
> > I've done merges both ways with your iommufd pull request and don't see
> > any conflicts relative to these changes. Kconfig, Makefile, and
> > vfio_main.c related to virq integration and group extraction are the
> > only conflicts.
>
> I got an extra hunk in the header file
>
> > Besides, it's already pushed and I don't have any references to the
> > old head, so someone would need to provide it if we wanted to keep
> > the old hashes.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git/tag/?h=for-linus-iommufd-merged
> https://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git/commit/?h=for-linus-iommufd-merged&id=e9a1f0f32d86c05f01878a0448384a46a453abc7

Ok, I do still have that reference around. Thanks.

> > > Why would we not revert everything from 6.2 - that is what we agreed
> > > to do?
> >
> > The decision to revert was based on the current interface being buggy,
> > abandoned, and re-implemented. It doesn't seem that there's much future
> > for the current interface, but Steve has stepped up to restrict the
> > current implementation to non-mdev devices, which resolves your concern
> > regarding unlimited user blocking of kernel threads afaict, and we'll
> > see what he does with locked memory.
>
> Except nobody has seen this yet, and it can't go into 6.2 at this
> point (see Linus's rather harsh remarks on late work for v6.2)

We already outlined earlier in this thread the criteria that prompted
us to tag the revert for stable, which was Steve's primary objection in
the short term. I can't in good faith push forward with a revert,
including stable, if Steve is working on a proposal to resolve the
issues prompting us to accelerate the code removal. Depending on the
scope of Steve's proposal, I think we might be able to still consider
this a fix for v6.2. Thanks,

Alex
On Mon, Dec 12, 2022 at 05:04:24PM -0700, Alex Williamson wrote:

> > > The decision to revert was based on the current interface being buggy,
> > > abandoned, and re-implemented. It doesn't seem that there's much future
> > > for the current interface, but Steve has stepped up to restrict the
> > > current implementation to non-mdev devices, which resolves your concern
> > > regarding unlimited user blocking of kernel threads afaict, and we'll
> > > see what he does with locked memory.
> >
> > Except nobody has seen this yet, and it can't go into 6.2 at this
> > point (see Linus's rather harsh remarks on late work for v6.2)
>
> We already outlined earlier in this thread the criteria that prompted
> us to tag the revert for stable, which was Steve's primary objection in
> the short term.

I still don't understand this, everyone running a distro deals with
the stuff. Even if you do blindly pull from a -stable branch instead
of cherry picking you only have to do the revert-revert once. Git is
good at this stuff.

Plus I have a doubt after all the backporting required to get vfio to
the required state that -stable patches are even going to work
anyhow..

> I can't in good faith push forward with a revert, including stable,
> if Steve is working on a proposal to resolve the issues prompting us
> to accelerate the code removal. Depending on the scope of Steve's
> proposal, I think we might be able to still consider this a fix for
> v6.2. Thanks,

Well, IMHO, you are better to send it for v6.2-rc1 than try to squeeze
it into this merge window and risk Linus's wrath

Jason
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 02c6ea3bed69..731d8d4b6524 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -790,7 +790,6 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
 	unsigned long remote_vaddr;
 	struct vfio_dma *dma;
 	bool do_accounting;
-	dma_addr_t iova;
 
 	if (!iommu || !pages)
 		return -EINVAL;
@@ -815,6 +814,7 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
 	do_accounting = list_empty(&iommu->domain_list);
 
 	for (i = 0; i < npage; i++) {
+		dma_addr_t iova;
 		unsigned long phys_pfn;
 		struct vfio_pfn *vpfn;
 
@@ -1467,7 +1467,6 @@ static bool vfio_iommu_iova_dma_valid(struct vfio_iommu *iommu,
 static int vfio_dma_do_map(struct vfio_iommu *iommu,
 			   struct vfio_iommu_type1_dma_map *map)
 {
-	bool set_vaddr = map->flags & VFIO_DMA_MAP_FLAG_VADDR;
 	dma_addr_t iova = map->iova;
 	unsigned long vaddr = map->vaddr;
 	size_t size = map->size;
@@ -1485,16 +1484,13 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
 	if (map->flags & VFIO_DMA_MAP_FLAG_READ)
 		prot |= IOMMU_READ;
 
-	if ((prot && set_vaddr) || (!prot && !set_vaddr))
-		return -EINVAL;
-
 	mutex_lock(&iommu->lock);
 
 	pgsize = (size_t)1 << __ffs(iommu->pgsize_bitmap);
 
 	WARN_ON((pgsize - 1) & PAGE_MASK);
 
-	if (!size || (size | iova | vaddr) & (pgsize - 1)) {
+	if (!prot || !size || (size | iova | vaddr) & (pgsize - 1)) {
 		ret = -EINVAL;
 		goto out_unlock;
 	}
@@ -1505,17 +1501,7 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
 		goto out_unlock;
 	}
 
-	dma = vfio_find_dma(iommu, iova, size);
-	if (set_vaddr) {
-		if (!dma) {
-			ret = -ENOENT;
-		} else if (dma->iova != iova || dma->size != size) {
-			ret = -EINVAL;
-		} else {
-			dma->vaddr = vaddr;
-		}
-		goto out_unlock;
-	} else if (dma) {
+	if (vfio_find_dma(iommu, iova, size)) {
 		ret = -EEXIST;
 		goto out_unlock;
 	}
@@ -2727,8 +2713,7 @@ static int vfio_iommu_type1_map_dma(struct vfio_iommu *iommu,
 {
 	struct vfio_iommu_type1_dma_map map;
 	unsigned long minsz;
-	uint32_t mask = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE |
-			VFIO_DMA_MAP_FLAG_VADDR;
+	uint32_t mask = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
 
 	minsz = offsetofend(struct vfio_iommu_type1_dma_map, size);
 
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 04d944c8941d..800ca94aafb3 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -49,8 +49,8 @@
 /* Supports VFIO_DMA_UNMAP_FLAG_ALL */
 #define VFIO_UNMAP_ALL 9
 
-/* Obsolete, not supported by any IOMMU. */
-#define VFIO_UPDATE_VADDR 10
+/* Probe for reverted vaddr removal and update support */
+#define VFIO_UPDATE_VADDR_DEPRECATED 10
 
 /*
  * The IOCTL interface is designed for extensibility by embedding the
@@ -1348,7 +1348,7 @@ struct vfio_iommu_type1_dma_map {
 	__u32 flags;
 #define VFIO_DMA_MAP_FLAG_READ (1 << 0) /* readable from device */
 #define VFIO_DMA_MAP_FLAG_WRITE (1 << 1) /* writable from device */
-#define VFIO_DMA_MAP_FLAG_VADDR (1 << 2)
+#define VFIO_DMA_MAP_FLAG_VADDR_DEPRECATED (1 << 2) /* prior vaddr remapping */
 	__u64 vaddr; /* Process virtual address */
 	__u64 iova; /* IO virtual address */
 	__u64 size; /* Size of mapping (bytes) */
@@ -1390,6 +1390,7 @@ struct vfio_iommu_type1_dma_unmap {
 	__u32 flags;
 #define VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP (1 << 0)
 #define VFIO_DMA_UNMAP_FLAG_ALL (1 << 1)
+#define VFIO_DMA_UNMAP_FLAG_VADDR_DEPRECATED (1 << 2) /* prior vaddr removal */
 	__u64 iova; /* IO virtual address */
 	__u64 size; /* Size of mapping (bytes) */
 	__u8 data[];
Fix several loose ends relative to reverting support for vaddr removal
and update. Mark feature and ioctl flags as deprecated, restore local
variable scope in pin pages, remove remaining support in the mapping
code.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

This applies on top of Steve's patch[1] to fully remove and deprecate
this feature in the short term, following the same methodology we used
for the v1 migration interface removal. The intention would be to pick
Steve's patch and this follow-on for v6.2 given that existing support
exposes vulnerabilities and no known upstream userspaces make use of
this feature.

[1]https://lore.kernel.org/all/1670363753-249738-2-git-send-email-steven.sistare@oracle.com/

 drivers/vfio/vfio_iommu_type1.c | 23 ++++-------------------
 include/uapi/linux/vfio.h | 7 ++++---
 2 files changed, 8 insertions(+), 22 deletions(-)
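
For readers skimming the patch, the effect of the mapping-code change on a map request can be sketched in userspace C. This is only an illustration under stated assumptions: check_map_request() is a hypothetical helper, not a kernel function; the flag values are copied from the patched uapi header; and it models only the flag mask from vfio_iommu_type1_map_dma() and the combined prot/size/alignment test from vfio_dma_do_map(), leaving out locking, overflow checks, and the lookup of existing mappings.

```c
#include <errno.h>
#include <stdint.h>

/* Flag values as they appear in the patched include/uapi/linux/vfio.h. */
#define VFIO_DMA_MAP_FLAG_READ             (1u << 0)
#define VFIO_DMA_MAP_FLAG_WRITE            (1u << 1)
#define VFIO_DMA_MAP_FLAG_VADDR_DEPRECATED (1u << 2)

/*
 * Hypothetical helper (not part of the kernel): models the post-patch
 * argument checks for a DMA map request. pgsize is assumed to be a
 * power of two, as the smallest IOMMU page size is in the driver.
 */
static int check_map_request(uint32_t flags, uint64_t iova, uint64_t vaddr,
			     uint64_t size, uint64_t pgsize)
{
	/* After the patch, only READ and WRITE remain in the accepted mask. */
	const uint32_t mask = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
	uint32_t prot;

	/* Any other bit, including the deprecated vaddr bit, is rejected. */
	if (flags & ~mask)
		return -EINVAL;

	prot = flags & mask;

	/* Needs an access right, a non-zero size, and page alignment. */
	if (!prot || !size || ((size | iova | vaddr) & (pgsize - 1)))
		return -EINVAL;

	return 0;
}
```

In this model a request carrying only the deprecated vaddr bit fails at the mask check, and a request with no access flags fails the !prot test that the patch folds into the size and alignment check.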