[V5,2/7] vfio/type1: prevent locked_vm underflow

Message ID	1671141424-81853-3-git-send-email-steven.sistare@oracle.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <kvm-owner@kernel.org> From: Steve Sistare <steven.sistare@oracle.com> To: kvm@vger.kernel.org Cc: Alex Williamson <alex.williamson@redhat.com>, Cornelia Huck <cohuck@redhat.com>, Jason Gunthorpe <jgg@nvidia.com>, Kevin Tian <kevin.tian@intel.com>, Steve Sistare <steven.sistare@oracle.com> Subject: [PATCH V5 2/7] vfio/type1: prevent locked_vm underflow Date: Thu, 15 Dec 2022 13:56:59 -0800 Message-Id: <1671141424-81853-3-git-send-email-steven.sistare@oracle.com> In-Reply-To: <1671141424-81853-1-git-send-email-steven.sistare@oracle.com> References: <1671141424-81853-1-git-send-email-steven.sistare@oracle.com> Precedence: bulk
Series	fixes for virtual address update \| expand [V5,0/7] fixes for virtual address update [V5,1/7] vfio/type1: exclude mdevs from VFIO_UPDATE_VADDR [V5,2/7] vfio/type1: prevent locked_vm underflow [V5,3/7] vfio/type1: count reserved pages [V5,4/7] vfio/type1: restore locked_vm [V5,5/7] vfio/type1: revert "block on invalid vaddr" [V5,6/7] vfio/type1: revert "implement notify callback" [V5,7/7] vfio: revert "iommu driver notify callback"

Steven Sistare Dec. 15, 2022, 9:56 p.m. UTC

When a vfio container is preserved across exec, the task does not change,
but it gets a new mm with locked_vm=0.  If the user later unmaps a dma
mapping, locked_vm underflows to a large unsigned value, and a subsequent
dma map request fails with ENOMEM in __account_locked_vm.

To avoid underflow, grab and save the mm at the time a dma is mapped.
Use that mm when adjusting locked_vm, rather than re-acquiring the saved
task's mm, which may have changed.  If the saved mm is dead, do nothing.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 drivers/vfio/vfio_iommu_type1.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

Jason Gunthorpe Dec. 16, 2022, 2:09 p.m. UTC | #1

On Thu, Dec 15, 2022 at 01:56:59PM -0800, Steve Sistare wrote:
> When a vfio container is preserved across exec, the task does not change,
> but it gets a new mm with locked_vm=0.  If the user later unmaps a dma
> mapping, locked_vm underflows to a large unsigned value, and a subsequent
> dma map request fails with ENOMEM in __account_locked_vm.
> 
> To avoid underflow, grab and save the mm at the time a dma is mapped.
> Use that mm when adjusting locked_vm, rather than re-acquiring the saved
> task's mm, which may have changed.  If the saved mm is dead, do nothing.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 17 ++++++++++-------
>  1 file changed, 10 insertions(+), 7 deletions(-)

Add fixes lines and a CC stable

The subject should be more like 'vfio/typ1: Prevent corruption of mm->locked_vm via exec()'

> @@ -1687,6 +1689,8 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>  	get_task_struct(current->group_leader);
>  	dma->task = current->group_leader;
>  	dma->lock_cap = capable(CAP_IPC_LOCK);
> +	dma->mm = dma->task->mm;

This should be current->mm, current->group_leader->mm is not quite the
same thing (and maybe another bug, I'm not sure)

Jason

Steven Sistare Dec. 16, 2022, 3:42 p.m. UTC | #2

On 12/16/2022 9:09 AM, Jason Gunthorpe wrote:
> On Thu, Dec 15, 2022 at 01:56:59PM -0800, Steve Sistare wrote:
>> When a vfio container is preserved across exec, the task does not change,
>> but it gets a new mm with locked_vm=0.  If the user later unmaps a dma
>> mapping, locked_vm underflows to a large unsigned value, and a subsequent
>> dma map request fails with ENOMEM in __account_locked_vm.
>>
>> To avoid underflow, grab and save the mm at the time a dma is mapped.
>> Use that mm when adjusting locked_vm, rather than re-acquiring the saved
>> task's mm, which may have changed.  If the saved mm is dead, do nothing.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>  drivers/vfio/vfio_iommu_type1.c | 17 ++++++++++-------
>>  1 file changed, 10 insertions(+), 7 deletions(-)
> 
> Add fixes lines and a CC stable

This predates the update vaddr functionality, so AFAICT:

    Fixes: 73fa0d10d077 ("vfio: Type1 IOMMU implementation")

I'll wait on cc'ing stable until alex has chimed in.

> The subject should be more like 'vfio/typ1: Prevent corruption of mm->locked_vm via exec()'

Underflow is a more precise description of the first corruption. How about:

vfio/type1: Prevent underflow of locked_vm via exec()

>> @@ -1687,6 +1689,8 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>>  	get_task_struct(current->group_leader);
>>  	dma->task = current->group_leader;
>>  	dma->lock_cap = capable(CAP_IPC_LOCK);
>> +	dma->mm = dma->task->mm;
> 
> This should be current->mm, current->group_leader->mm is not quite the
> same thing (and maybe another bug, I'm not sure)

When are they different -- when the leader is a zombie?

BTW I just noticed I need to update the comments about mm preceding these lines.

- Steve

Alex Williamson Dec. 16, 2022, 4:10 p.m. UTC | #3

On Fri, 16 Dec 2022 10:42:13 -0500
Steven Sistare <steven.sistare@oracle.com> wrote:

> On 12/16/2022 9:09 AM, Jason Gunthorpe wrote:
> > On Thu, Dec 15, 2022 at 01:56:59PM -0800, Steve Sistare wrote:  
> >> When a vfio container is preserved across exec, the task does not change,
> >> but it gets a new mm with locked_vm=0.  If the user later unmaps a dma
> >> mapping, locked_vm underflows to a large unsigned value, and a subsequent
> >> dma map request fails with ENOMEM in __account_locked_vm.
> >>
> >> To avoid underflow, grab and save the mm at the time a dma is mapped.
> >> Use that mm when adjusting locked_vm, rather than re-acquiring the saved
> >> task's mm, which may have changed.  If the saved mm is dead, do nothing.
> >>
> >> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> >> ---
> >>  drivers/vfio/vfio_iommu_type1.c | 17 ++++++++++-------
> >>  1 file changed, 10 insertions(+), 7 deletions(-)  
> > 
> > Add fixes lines and a CC stable  
> 
> This predates the update vaddr functionality, so AFAICT:
> 
>     Fixes: 73fa0d10d077 ("vfio: Type1 IOMMU implementation")
> 
> I'll wait on cc'ing stable until alex has chimed in.

Technically, adding the stable Cc tag is still the correct approach per
the stable process docs, but the Fixes: tag alone is generally
sufficient to crank up the backport engines.  The original
implementation is probably the correct commit to identify, exec was
certainly not considered there.  Thanks,

Alex
 
> > The subject should be more like 'vfio/typ1: Prevent corruption of mm->locked_vm via exec()'  
> 
> Underflow is a more precise description of the first corruption. How about:
> 
> vfio/type1: Prevent underflow of locked_vm via exec()
> 
> >> @@ -1687,6 +1689,8 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
> >>  	get_task_struct(current->group_leader);
> >>  	dma->task = current->group_leader;
> >>  	dma->lock_cap = capable(CAP_IPC_LOCK);
> >> +	dma->mm = dma->task->mm;  
> > 
> > This should be current->mm, current->group_leader->mm is not quite the
> > same thing (and maybe another bug, I'm not sure)  
> 
> When are they different -- when the leader is a zombie?
> 
> BTW I just noticed I need to update the comments about mm preceding these lines.
> 
> - Steve
>

Steven Sistare Dec. 16, 2022, 4:16 p.m. UTC | #4

On 12/16/2022 11:10 AM, Alex Williamson wrote:
> On Fri, 16 Dec 2022 10:42:13 -0500
> Steven Sistare <steven.sistare@oracle.com> wrote:
> 
>> On 12/16/2022 9:09 AM, Jason Gunthorpe wrote:
>>> On Thu, Dec 15, 2022 at 01:56:59PM -0800, Steve Sistare wrote:  
>>>> When a vfio container is preserved across exec, the task does not change,
>>>> but it gets a new mm with locked_vm=0.  If the user later unmaps a dma
>>>> mapping, locked_vm underflows to a large unsigned value, and a subsequent
>>>> dma map request fails with ENOMEM in __account_locked_vm.
>>>>
>>>> To avoid underflow, grab and save the mm at the time a dma is mapped.
>>>> Use that mm when adjusting locked_vm, rather than re-acquiring the saved
>>>> task's mm, which may have changed.  If the saved mm is dead, do nothing.
>>>>
>>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>>> ---
>>>>  drivers/vfio/vfio_iommu_type1.c | 17 ++++++++++-------
>>>>  1 file changed, 10 insertions(+), 7 deletions(-)  
>>>
>>> Add fixes lines and a CC stable  
>>
>> This predates the update vaddr functionality, so AFAICT:
>>
>>     Fixes: 73fa0d10d077 ("vfio: Type1 IOMMU implementation")
>>
>> I'll wait on cc'ing stable until alex has chimed in.
> 
> Technically, adding the stable Cc tag is still the correct approach per
> the stable process docs, but the Fixes: tag alone is generally
> sufficient to crank up the backport engines.  The original
> implementation is probably the correct commit to identify, exec was
> certainly not considered there.  Thanks,

Should I cc stable on the whole series, or re-send individually?  If the
latter, which ones?

- Steve
  
>>> The subject should be more like 'vfio/typ1: Prevent corruption of mm->locked_vm via exec()'  
>>
>> Underflow is a more precise description of the first corruption. How about:
>>
>> vfio/type1: Prevent underflow of locked_vm via exec()
>>
>>>> @@ -1687,6 +1689,8 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>>>>  	get_task_struct(current->group_leader);
>>>>  	dma->task = current->group_leader;
>>>>  	dma->lock_cap = capable(CAP_IPC_LOCK);
>>>> +	dma->mm = dma->task->mm;  
>>>
>>> This should be current->mm, current->group_leader->mm is not quite the
>>> same thing (and maybe another bug, I'm not sure)  
>>
>> When are they different -- when the leader is a zombie?
>>
>> BTW I just noticed I need to update the comments about mm preceding these lines.
>>
>> - Steve
>>
>

Alex Williamson Dec. 16, 2022, 4:33 p.m. UTC | #5

On Fri, 16 Dec 2022 11:16:59 -0500
Steven Sistare <steven.sistare@oracle.com> wrote:

> On 12/16/2022 11:10 AM, Alex Williamson wrote:
> > On Fri, 16 Dec 2022 10:42:13 -0500
> > Steven Sistare <steven.sistare@oracle.com> wrote:
> >   
> >> On 12/16/2022 9:09 AM, Jason Gunthorpe wrote:  
> >>> On Thu, Dec 15, 2022 at 01:56:59PM -0800, Steve Sistare wrote:    
> >>>> When a vfio container is preserved across exec, the task does not change,
> >>>> but it gets a new mm with locked_vm=0.  If the user later unmaps a dma
> >>>> mapping, locked_vm underflows to a large unsigned value, and a subsequent
> >>>> dma map request fails with ENOMEM in __account_locked_vm.
> >>>>
> >>>> To avoid underflow, grab and save the mm at the time a dma is mapped.
> >>>> Use that mm when adjusting locked_vm, rather than re-acquiring the saved
> >>>> task's mm, which may have changed.  If the saved mm is dead, do nothing.
> >>>>
> >>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> >>>> ---
> >>>>  drivers/vfio/vfio_iommu_type1.c | 17 ++++++++++-------
> >>>>  1 file changed, 10 insertions(+), 7 deletions(-)    
> >>>
> >>> Add fixes lines and a CC stable    
> >>
> >> This predates the update vaddr functionality, so AFAICT:
> >>
> >>     Fixes: 73fa0d10d077 ("vfio: Type1 IOMMU implementation")
> >>
> >> I'll wait on cc'ing stable until alex has chimed in.  
> > 
> > Technically, adding the stable Cc tag is still the correct approach per
> > the stable process docs, but the Fixes: tag alone is generally
> > sufficient to crank up the backport engines.  The original
> > implementation is probably the correct commit to identify, exec was
> > certainly not considered there.  Thanks,  
> 
> Should I cc stable on the whole series, or re-send individually?  If the
> latter, which ones?

Only for the Fixes: commits, and note that Cc: stable is effectively
just a tag like Fixes:, you don't actually need to copy the stable@vger
list on the patch (and IIRC you'll get some correctional email if you
do).  Thanks,

Alex

Jason Gunthorpe Dec. 16, 2022, 5:07 p.m. UTC | #6

On Fri, Dec 16, 2022 at 10:42:13AM -0500, Steven Sistare wrote:
> On 12/16/2022 9:09 AM, Jason Gunthorpe wrote:
> > On Thu, Dec 15, 2022 at 01:56:59PM -0800, Steve Sistare wrote:
> >> When a vfio container is preserved across exec, the task does not change,
> >> but it gets a new mm with locked_vm=0.  If the user later unmaps a dma
> >> mapping, locked_vm underflows to a large unsigned value, and a subsequent
> >> dma map request fails with ENOMEM in __account_locked_vm.
> >>
> >> To avoid underflow, grab and save the mm at the time a dma is mapped.
> >> Use that mm when adjusting locked_vm, rather than re-acquiring the saved
> >> task's mm, which may have changed.  If the saved mm is dead, do nothing.
> >>
> >> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> >> ---
> >>  drivers/vfio/vfio_iommu_type1.c | 17 ++++++++++-------
> >>  1 file changed, 10 insertions(+), 7 deletions(-)
> > 
> > Add fixes lines and a CC stable
> 
> This predates the update vaddr functionality, so AFAICT:
> 
>     Fixes: 73fa0d10d077 ("vfio: Type1 IOMMU implementation")
> 
> I'll wait on cc'ing stable until alex has chimed in.

Yes

> > The subject should be more like 'vfio/typ1: Prevent corruption of mm->locked_vm via exec()'
> 
> Underflow is a more precise description of the first corruption. How about:
> 
> vfio/type1: Prevent underflow of locked_vm via exec()

sure
 
> >> @@ -1687,6 +1689,8 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
> >>  	get_task_struct(current->group_leader);
> >>  	dma->task = current->group_leader;
> >>  	dma->lock_cap = capable(CAP_IPC_LOCK);
> >> +	dma->mm = dma->task->mm;
> > 
> > This should be current->mm, current->group_leader->mm is not quite the
> > same thing (and maybe another bug, I'm not sure)
> 
> When are they different -- when the leader is a zombie?

I'm actually not sure if they can be different, but if they are
different then group_leader is the wrong one. Better not to chance it

Jason

[V5,2/7] vfio/type1: prevent locked_vm underflow

Commit Message

Comments

Patch