Message ID | 1671141424-81853-3-git-send-email-steven.sistare@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | fixes for virtual address update | expand |
On Thu, Dec 15, 2022 at 01:56:59PM -0800, Steve Sistare wrote: > When a vfio container is preserved across exec, the task does not change, > but it gets a new mm with locked_vm=0. If the user later unmaps a dma > mapping, locked_vm underflows to a large unsigned value, and a subsequent > dma map request fails with ENOMEM in __account_locked_vm. > > To avoid underflow, grab and save the mm at the time a dma is mapped. > Use that mm when adjusting locked_vm, rather than re-acquiring the saved > task's mm, which may have changed. If the saved mm is dead, do nothing. > > Signed-off-by: Steve Sistare <steven.sistare@oracle.com> > --- > drivers/vfio/vfio_iommu_type1.c | 17 ++++++++++------- > 1 file changed, 10 insertions(+), 7 deletions(-) Add fixes lines and a CC stable The subject should be more like 'vfio/typ1: Prevent corruption of mm->locked_vm via exec()' > @@ -1687,6 +1689,8 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu, > get_task_struct(current->group_leader); > dma->task = current->group_leader; > dma->lock_cap = capable(CAP_IPC_LOCK); > + dma->mm = dma->task->mm; This should be current->mm, current->group_leader->mm is not quite the same thing (and maybe another bug, I'm not sure) Jason
On 12/16/2022 9:09 AM, Jason Gunthorpe wrote: > On Thu, Dec 15, 2022 at 01:56:59PM -0800, Steve Sistare wrote: >> When a vfio container is preserved across exec, the task does not change, >> but it gets a new mm with locked_vm=0. If the user later unmaps a dma >> mapping, locked_vm underflows to a large unsigned value, and a subsequent >> dma map request fails with ENOMEM in __account_locked_vm. >> >> To avoid underflow, grab and save the mm at the time a dma is mapped. >> Use that mm when adjusting locked_vm, rather than re-acquiring the saved >> task's mm, which may have changed. If the saved mm is dead, do nothing. >> >> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> >> --- >> drivers/vfio/vfio_iommu_type1.c | 17 ++++++++++------- >> 1 file changed, 10 insertions(+), 7 deletions(-) > > Add fixes lines and a CC stable This predates the update vaddr functionality, so AFAICT: Fixes: 73fa0d10d077 ("vfio: Type1 IOMMU implementation") I'll wait on cc'ing stable until alex has chimed in. > The subject should be more like 'vfio/typ1: Prevent corruption of mm->locked_vm via exec()' Underflow is a more precise description of the first corruption. How about: vfio/type1: Prevent underflow of locked_vm via exec() >> @@ -1687,6 +1689,8 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu, >> get_task_struct(current->group_leader); >> dma->task = current->group_leader; >> dma->lock_cap = capable(CAP_IPC_LOCK); >> + dma->mm = dma->task->mm; > > This should be current->mm, current->group_leader->mm is not quite the > same thing (and maybe another bug, I'm not sure) When are they different -- when the leader is a zombie? BTW I just noticed I need to update the comments about mm preceding these lines. - Steve
On Fri, 16 Dec 2022 10:42:13 -0500 Steven Sistare <steven.sistare@oracle.com> wrote: > On 12/16/2022 9:09 AM, Jason Gunthorpe wrote: > > On Thu, Dec 15, 2022 at 01:56:59PM -0800, Steve Sistare wrote: > >> When a vfio container is preserved across exec, the task does not change, > >> but it gets a new mm with locked_vm=0. If the user later unmaps a dma > >> mapping, locked_vm underflows to a large unsigned value, and a subsequent > >> dma map request fails with ENOMEM in __account_locked_vm. > >> > >> To avoid underflow, grab and save the mm at the time a dma is mapped. > >> Use that mm when adjusting locked_vm, rather than re-acquiring the saved > >> task's mm, which may have changed. If the saved mm is dead, do nothing. > >> > >> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> > >> --- > >> drivers/vfio/vfio_iommu_type1.c | 17 ++++++++++------- > >> 1 file changed, 10 insertions(+), 7 deletions(-) > > > > Add fixes lines and a CC stable > > This predates the update vaddr functionality, so AFAICT: > > Fixes: 73fa0d10d077 ("vfio: Type1 IOMMU implementation") > > I'll wait on cc'ing stable until alex has chimed in. Technically, adding the stable Cc tag is still the correct approach per the stable process docs, but the Fixes: tag alone is generally sufficient to crank up the backport engines. The original implementation is probably the correct commit to identify, exec was certainly not considered there. Thanks, Alex > > The subject should be more like 'vfio/typ1: Prevent corruption of mm->locked_vm via exec()' > > Underflow is a more precise description of the first corruption. How about: > > vfio/type1: Prevent underflow of locked_vm via exec() > > >> @@ -1687,6 +1689,8 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu, > >> get_task_struct(current->group_leader); > >> dma->task = current->group_leader; > >> dma->lock_cap = capable(CAP_IPC_LOCK); > >> + dma->mm = dma->task->mm; > > > > This should be current->mm, current->group_leader->mm is not quite the > > same thing (and maybe another bug, I'm not sure) > > When are they different -- when the leader is a zombie? > > BTW I just noticed I need to update the comments about mm preceding these lines. > > - Steve >
On 12/16/2022 11:10 AM, Alex Williamson wrote: > On Fri, 16 Dec 2022 10:42:13 -0500 > Steven Sistare <steven.sistare@oracle.com> wrote: > >> On 12/16/2022 9:09 AM, Jason Gunthorpe wrote: >>> On Thu, Dec 15, 2022 at 01:56:59PM -0800, Steve Sistare wrote: >>>> When a vfio container is preserved across exec, the task does not change, >>>> but it gets a new mm with locked_vm=0. If the user later unmaps a dma >>>> mapping, locked_vm underflows to a large unsigned value, and a subsequent >>>> dma map request fails with ENOMEM in __account_locked_vm. >>>> >>>> To avoid underflow, grab and save the mm at the time a dma is mapped. >>>> Use that mm when adjusting locked_vm, rather than re-acquiring the saved >>>> task's mm, which may have changed. If the saved mm is dead, do nothing. >>>> >>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> >>>> --- >>>> drivers/vfio/vfio_iommu_type1.c | 17 ++++++++++------- >>>> 1 file changed, 10 insertions(+), 7 deletions(-) >>> >>> Add fixes lines and a CC stable >> >> This predates the update vaddr functionality, so AFAICT: >> >> Fixes: 73fa0d10d077 ("vfio: Type1 IOMMU implementation") >> >> I'll wait on cc'ing stable until alex has chimed in. > > Technically, adding the stable Cc tag is still the correct approach per > the stable process docs, but the Fixes: tag alone is generally > sufficient to crank up the backport engines. The original > implementation is probably the correct commit to identify, exec was > certainly not considered there. Thanks, Should I cc stable on the whole series, or re-send individually? If the latter, which ones? - Steve >>> The subject should be more like 'vfio/typ1: Prevent corruption of mm->locked_vm via exec()' >> >> Underflow is a more precise description of the first corruption. How about: >> >> vfio/type1: Prevent underflow of locked_vm via exec() >> >>>> @@ -1687,6 +1689,8 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu, >>>> get_task_struct(current->group_leader); >>>> dma->task = current->group_leader; >>>> dma->lock_cap = capable(CAP_IPC_LOCK); >>>> + dma->mm = dma->task->mm; >>> >>> This should be current->mm, current->group_leader->mm is not quite the >>> same thing (and maybe another bug, I'm not sure) >> >> When are they different -- when the leader is a zombie? >> >> BTW I just noticed I need to update the comments about mm preceding these lines. >> >> - Steve >> >
On Fri, 16 Dec 2022 11:16:59 -0500 Steven Sistare <steven.sistare@oracle.com> wrote: > On 12/16/2022 11:10 AM, Alex Williamson wrote: > > On Fri, 16 Dec 2022 10:42:13 -0500 > > Steven Sistare <steven.sistare@oracle.com> wrote: > > > >> On 12/16/2022 9:09 AM, Jason Gunthorpe wrote: > >>> On Thu, Dec 15, 2022 at 01:56:59PM -0800, Steve Sistare wrote: > >>>> When a vfio container is preserved across exec, the task does not change, > >>>> but it gets a new mm with locked_vm=0. If the user later unmaps a dma > >>>> mapping, locked_vm underflows to a large unsigned value, and a subsequent > >>>> dma map request fails with ENOMEM in __account_locked_vm. > >>>> > >>>> To avoid underflow, grab and save the mm at the time a dma is mapped. > >>>> Use that mm when adjusting locked_vm, rather than re-acquiring the saved > >>>> task's mm, which may have changed. If the saved mm is dead, do nothing. > >>>> > >>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> > >>>> --- > >>>> drivers/vfio/vfio_iommu_type1.c | 17 ++++++++++------- > >>>> 1 file changed, 10 insertions(+), 7 deletions(-) > >>> > >>> Add fixes lines and a CC stable > >> > >> This predates the update vaddr functionality, so AFAICT: > >> > >> Fixes: 73fa0d10d077 ("vfio: Type1 IOMMU implementation") > >> > >> I'll wait on cc'ing stable until alex has chimed in. > > > > Technically, adding the stable Cc tag is still the correct approach per > > the stable process docs, but the Fixes: tag alone is generally > > sufficient to crank up the backport engines. The original > > implementation is probably the correct commit to identify, exec was > > certainly not considered there. Thanks, > > Should I cc stable on the whole series, or re-send individually? If the > latter, which ones? Only for the Fixes: commits, and note that Cc: stable is effectively just a tag like Fixes:, you don't actually need to copy the stable@vger list on the patch (and IIRC you'll get some correctional email if you do). Thanks, Alex
On Fri, Dec 16, 2022 at 10:42:13AM -0500, Steven Sistare wrote: > On 12/16/2022 9:09 AM, Jason Gunthorpe wrote: > > On Thu, Dec 15, 2022 at 01:56:59PM -0800, Steve Sistare wrote: > >> When a vfio container is preserved across exec, the task does not change, > >> but it gets a new mm with locked_vm=0. If the user later unmaps a dma > >> mapping, locked_vm underflows to a large unsigned value, and a subsequent > >> dma map request fails with ENOMEM in __account_locked_vm. > >> > >> To avoid underflow, grab and save the mm at the time a dma is mapped. > >> Use that mm when adjusting locked_vm, rather than re-acquiring the saved > >> task's mm, which may have changed. If the saved mm is dead, do nothing. > >> > >> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> > >> --- > >> drivers/vfio/vfio_iommu_type1.c | 17 ++++++++++------- > >> 1 file changed, 10 insertions(+), 7 deletions(-) > > > > Add fixes lines and a CC stable > > This predates the update vaddr functionality, so AFAICT: > > Fixes: 73fa0d10d077 ("vfio: Type1 IOMMU implementation") > > I'll wait on cc'ing stable until alex has chimed in. Yes > > The subject should be more like 'vfio/typ1: Prevent corruption of mm->locked_vm via exec()' > > Underflow is a more precise description of the first corruption. How about: > > vfio/type1: Prevent underflow of locked_vm via exec() sure > >> @@ -1687,6 +1689,8 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu, > >> get_task_struct(current->group_leader); > >> dma->task = current->group_leader; > >> dma->lock_cap = capable(CAP_IPC_LOCK); > >> + dma->mm = dma->task->mm; > > > > This should be current->mm, current->group_leader->mm is not quite the > > same thing (and maybe another bug, I'm not sure) > > When are they different -- when the leader is a zombie? I'm actually not sure if they can be different, but if they are different then group_leader is the wrong one. Better not to chance it Jason
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 144f5bb..cd49b656 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -100,6 +100,7 @@ struct vfio_dma { struct task_struct *task; struct rb_root pfn_list; /* Ex-user pinned pfn list */ unsigned long *bitmap; + struct mm_struct *mm; }; struct vfio_batch { @@ -420,8 +421,8 @@ static int vfio_lock_acct(struct vfio_dma *dma, long npage, bool async) if (!npage) return 0; - mm = async ? get_task_mm(dma->task) : dma->task->mm; - if (!mm) + mm = dma->mm; + if (async && !mmget_not_zero(mm)) return -ESRCH; /* process exited */ ret = mmap_write_lock_killable(mm); @@ -794,8 +795,8 @@ static int vfio_pin_page_external(struct vfio_dma *dma, unsigned long vaddr, struct mm_struct *mm; int ret; - mm = get_task_mm(dma->task); - if (!mm) + mm = dma->mm; + if (!mmget_not_zero(mm)) return -ENODEV; ret = vaddr_get_pfns(mm, vaddr, 1, dma->prot, pfn_base, pages); @@ -1180,6 +1181,7 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma) vfio_unmap_unpin(iommu, dma, true); vfio_unlink_dma(iommu, dma); put_task_struct(dma->task); + mmdrop(dma->mm); vfio_dma_bitmap_free(dma); if (dma->vaddr_invalid) { iommu->vaddr_invalid_count--; @@ -1687,6 +1689,8 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu, get_task_struct(current->group_leader); dma->task = current->group_leader; dma->lock_cap = capable(CAP_IPC_LOCK); + dma->mm = dma->task->mm; + mmgrab(dma->mm); dma->pfn_list = RB_ROOT; @@ -3122,9 +3126,8 @@ static int vfio_iommu_type1_dma_rw_chunk(struct vfio_iommu *iommu, !(dma->prot & IOMMU_READ)) return -EPERM; - mm = get_task_mm(dma->task); - - if (!mm) + mm = dma->mm; + if (!mmget_not_zero(mm)) return -EPERM; if (kthread)
When a vfio container is preserved across exec, the task does not change, but it gets a new mm with locked_vm=0. If the user later unmaps a dma mapping, locked_vm underflows to a large unsigned value, and a subsequent dma map request fails with ENOMEM in __account_locked_vm. To avoid underflow, grab and save the mm at the time a dma is mapped. Use that mm when adjusting locked_vm, rather than re-acquiring the saved task's mm, which may have changed. If the saved mm is dead, do nothing. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> --- drivers/vfio/vfio_iommu_type1.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-)