diff mbox series

[09/17] mm: export access_remote_vm() symbol

Message ID 20230103163505.1569356-10-fenghua.yu@intel.com (mailing list archive)
State New
Headers show
Series None | expand

Commit Message

Fenghua Yu Jan. 3, 2023, 4:34 p.m. UTC
From: Dave Jiang <dave.jiang@intel.com>

Export access_remote_vm() symbol for driver usage. The idxd driver would
like to use it to write the user's completion record that the hardware
device is not able to write to due to user page fault.

Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
---
 mm/memory.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Lorenzo Stoakes Jan. 3, 2023, 5:45 p.m. UTC | #1
On Tue, Jan 03, 2023 at 08:34:57AM -0800, Fenghua Yu wrote:
> From: Dave Jiang <dave.jiang@intel.com>
>
> Export access_remote_vm() symbol for driver usage. The idxd driver would
> like to use it to write the user's completion record that the hardware
> device is not able to write to due to user page fault.
>
> Tested-by: Tony Zhu <tony.zhu@intel.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: linux-mm@kvack.org
> ---
>  mm/memory.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index aad226daf41b..caae4deff987 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5579,6 +5579,7 @@ int access_remote_vm(struct mm_struct *mm, unsigned long addr,
>  {
>  	return __access_remote_vm(mm, addr, buf, len, gup_flags);
>  }
> +EXPORT_SYMBOL_GPL(access_remote_vm);
>
>  /*
>   * Access another process' address space.
> --
> 2.32.0
>

Can you explain what use case you have for exporting this? Currently this is
only used by procfs.

Additionally, it relies on a reference count being held on the mm which seems a
little risky exposing to a driver.

Is there a reason you can't use access_process_vm() which is exported and
additionally handles the refrencing?
Lorenzo Stoakes Jan. 3, 2023, 5:50 p.m. UTC | #2
On Tue, Jan 03, 2023 at 05:45:30PM +0000, Lorenzo Stoakes wrote:
> Can you explain what use case you have for exporting this? Currently this is
> only used by procfs.
>

Ugh I hit send too soon ! I see you've explained the _why_ (i.e. idxd). The
other two points remain, however.

> Additionally, it relies on a reference count being held on the mm which seems a
> little risky exposing to a driver.
>
> Is there a reason you can't use access_process_vm() which is exported and
> additionally handles the refrencing?
Fenghua Yu Jan. 3, 2023, 7:20 p.m. UTC | #3
Hi, Lorenzo,

> On Tue, Jan 03, 2023 at 09:46:00AM -0800, Lorenzo Stoakes wrote:
> On Tue, Jan 03, 2023 at 05:45:30PM +0000, Lorenzo Stoakes wrote:
> > Can you explain what use case you have for exporting this? Currently
> > this is only used by procfs.
> >
> 
> Ugh I hit send too soon ! I see you've explained the _why_ (i.e. idxd). The other
> two points remain, however.
> 
> > Additionally, it relies on a reference count being held on the mm
> > which seems a little risky exposing to a driver.

access_remote_vm(mm) directly call __access_remote_vm(mm).
access_process_vm(tsk) calls mm=get_task_mm() then __access_remote_vm(mm).

So instead of access_remote_vm(mm), it's access_process_vm(tsk) that holds
a reference count on the mm, right?

> >
> > Is there a reason you can't use access_process_vm() which is exported
> > and additionally handles the refrencing?

IDXD interrupt handler starts a work which needs to access remote vm.
The remote mm is found by PASID which is saved in device event log.

In the work, it's hard to get the remote mm from a task because mm->owner could be NULL
but the mm is still existing.

Thanks.

-Fenghua
Lorenzo Stoakes Jan. 3, 2023, 8:13 p.m. UTC | #4
On Tue, Jan 03, 2023 at 07:20:11PM +0000, Yu, Fenghua wrote:
> Hi, Lorenzo,
>

Hey Fenghua :)

> access_remote_vm(mm) directly call __access_remote_vm(mm).
> access_process_vm(tsk) calls mm=get_task_mm() then __access_remote_vm(mm).
>
> So instead of access_remote_vm(mm), it's access_process_vm(tsk) that holds
> a reference count on the mm, right?

Indeed!

>
> > >
> > > Is there a reason you can't use access_process_vm() which is exported
> > > and additionally handles the refrencing?
>
> IDXD interrupt handler starts a work which needs to access remote vm.
> The remote mm is found by PASID which is saved in device event log.
>
> In the work, it's hard to get the remote mm from a task because mm->owner could be NULL
> but the mm is still existing.

That makes sense, however I do feel nervous about exporting something that that
relies on this reference.

The issue is ensuring that the mm can't be taken from underneath you, the only
user of access_remote_vm(), procfs, does a careful dance using get_task_mm() and
mm_access() to ensure this can't happen, if _sometimes_ the remote mm might have
an owner and _sometimes_ not it feels like any exported function needs to be
equally careful?

I definitely don't feel as if simply exporting this is a safe option, and you
would in that case need a new function that handles different scenarios of mm
ownership/not.

I may be missing something here and I will wait for others to chime in but I
think we would definitely need something more than simply exporting this.
Fenghua Yu Jan. 4, 2023, 5:06 a.m. UTC | #5
Hi, Lorenzo,

> Hey Fenghua :)
> 
> > access_remote_vm(mm) directly call __access_remote_vm(mm).
> > access_process_vm(tsk) calls mm=get_task_mm() then
> __access_remote_vm(mm).
> >
> > So instead of access_remote_vm(mm), it's access_process_vm(tsk) that
> > holds a reference count on the mm, right?
> 
> Indeed!
> 
> >
> > > >
> > > > Is there a reason you can't use access_process_vm() which is
> > > > exported and additionally handles the refrencing?
> >
> > IDXD interrupt handler starts a work which needs to access remote vm.
> > The remote mm is found by PASID which is saved in device event log.
> >
> > In the work, it's hard to get the remote mm from a task because
> > mm->owner could be NULL but the mm is still existing.
> 
> That makes sense, however I do feel nervous about exporting something that
> that relies on this reference.
> 
> The issue is ensuring that the mm can't be taken from underneath you, the only
> user of access_remote_vm(), procfs, does a careful dance using get_task_mm()
> and
> mm_access() to ensure this can't happen, if _sometimes_ the remote mm might
> have an owner and _sometimes_ not it feels like any exported function needs to
> be equally careful?
> 
> I definitely don't feel as if simply exporting this is a safe option, and you would in
> that case need a new function that handles different scenarios of mm
> ownership/not.
> 
> I may be missing something here and I will wait for others to chime in but I think
> we would definitely need something more than simply exporting this.

I may define and export a new wrapper access_remote_vm_ref() which will hold
mm's reference count before accessing it:
int access_remote_vm_ref(mm)
{
   int ret;

   if (mm == &init_mm)
        return 0;

   mmget(mm);
   ret = access_remote_vm(mm);
   mmput(mm);

   return ret;
}
EXPORT_SYMBOL_GPL(access_remote_vm_ref);

IDXD or any driver calls this and holds mm reference count while accesses the mm.
This is useful for caller to directly access mm even if mm's owner could be NULL.

Do you think this is sufficient to take care of the mm reference and is a good way to go?

Thanks.

-Fenghua
Alistair Popple Jan. 4, 2023, 6:12 a.m. UTC | #6
"Yu, Fenghua" <fenghua.yu@intel.com> writes:

> Hi, Lorenzo,
>
>> Hey Fenghua :)
>> 
>> > access_remote_vm(mm) directly call __access_remote_vm(mm).
>> > access_process_vm(tsk) calls mm=get_task_mm() then
>> __access_remote_vm(mm).
>> >
>> > So instead of access_remote_vm(mm), it's access_process_vm(tsk) that
>> > holds a reference count on the mm, right?
>> 
>> Indeed!
>> 
>> >
>> > > >
>> > > > Is there a reason you can't use access_process_vm() which is
>> > > > exported and additionally handles the refrencing?
>> >
>> > IDXD interrupt handler starts a work which needs to access remote vm.
>> > The remote mm is found by PASID which is saved in device event log.
>> >
>> > In the work, it's hard to get the remote mm from a task because
>> > mm->owner could be NULL but the mm is still existing.
>> 
>> That makes sense, however I do feel nervous about exporting something that
>> that relies on this reference.
>> 
>> The issue is ensuring that the mm can't be taken from underneath you, the only
>> user of access_remote_vm(), procfs, does a careful dance using get_task_mm()
>> and
>> mm_access() to ensure this can't happen, if _sometimes_ the remote mm might
>> have an owner and _sometimes_ not it feels like any exported function needs to
>> be equally careful?

I think the point is the remote mm should be valid as long as the PASID
is valid because it doesn't make sense to have a PASID without
associated memory map. iommu_sva_find() does mmget_not_zero() to ensure
that.

Obviously something must still be holding a mmgrab() though. That should
happen as part of the PASID allocation done by iommu_sva_bind_device().
 
>> I definitely don't feel as if simply exporting this is a safe option, and you would in
>> that case need a new function that handles different scenarios of mm
>> ownership/not.

Note this isn't that different from get_user_pages_remote().
 
>> I may be missing something here and I will wait for others to chime in but I think
>> we would definitely need something more than simply exporting this.
>
> I may define and export a new wrapper access_remote_vm_ref() which will hold
> mm's reference count before accessing it:
> int access_remote_vm_ref(mm)
> {
>    int ret;
>
>    if (mm == &init_mm)
>         return 0;
>
>    mmget(mm);
>    ret = access_remote_vm(mm);
>    mmput(mm);
>
>    return ret;
> }
> EXPORT_SYMBOL_GPL(access_remote_vm_ref);
>
> IDXD or any driver calls this and holds mm reference count while accesses the mm.
> This is useful for caller to directly access mm even if mm's owner could be NULL.

I'm not sure that helps much. A driver would still need to hold a
mm_count to ensure the struct_mm itself can't go away anyway so it may
as well do the mmget() IMHO (although it really should be
mmget_not_zero()).

In any case though iommu_sva_find() already takes care of doing
mmget_not_zero(). I wonder if it makes more sense to define a wrapper
(eg. iommu_access_pasid) that takes a PASID and does the mm
lookup/access_vm/mmput?

> Do you think this is sufficient to take care of the mm reference and is a good way to go?
>
> Thanks.
>
> -Fenghua
Fenghua Yu Jan. 4, 2023, 7 p.m. UTC | #7
Hi, Alistair and Lorenzo,

> >> > access_remote_vm(mm) directly call __access_remote_vm(mm).
> >> > access_process_vm(tsk) calls mm=get_task_mm() then
> >> __access_remote_vm(mm).
> >> >
> >> > So instead of access_remote_vm(mm), it's access_process_vm(tsk)
> >> > that holds a reference count on the mm, right?
> >>
> >> Indeed!
> >>
> >> >
> >> > > >
> >> > > > Is there a reason you can't use access_process_vm() which is
> >> > > > exported and additionally handles the refrencing?
> >> >
> >> > IDXD interrupt handler starts a work which needs to access remote vm.
> >> > The remote mm is found by PASID which is saved in device event log.
> >> >
> >> > In the work, it's hard to get the remote mm from a task because
> >> > mm->owner could be NULL but the mm is still existing.
> >>
> >> That makes sense, however I do feel nervous about exporting something
> >> that that relies on this reference.
> >>
> >> The issue is ensuring that the mm can't be taken from underneath you,
> >> the only user of access_remote_vm(), procfs, does a careful dance
> >> using get_task_mm() and
> >> mm_access() to ensure this can't happen, if _sometimes_ the remote mm
> >> might have an owner and _sometimes_ not it feels like any exported
> >> function needs to be equally careful?
> 
> I think the point is the remote mm should be valid as long as the PASID is valid
> because it doesn't make sense to have a PASID without associated memory map.
> iommu_sva_find() does mmget_not_zero() to ensure that.
> 
> Obviously something must still be holding a mmgrab() though. That should
> happen as part of the PASID allocation done by iommu_sva_bind_device().
> 
> >> I definitely don't feel as if simply exporting this is a safe option,
> >> and you would in that case need a new function that handles different
> >> scenarios of mm ownership/not.
> 
> Note this isn't that different from get_user_pages_remote().
> 
> >> I may be missing something here and I will wait for others to chime
> >> in but I think we would definitely need something more than simply exporting
> this.
> >
> > I may define and export a new wrapper access_remote_vm_ref() which
> > will hold mm's reference count before accessing it:
> > int access_remote_vm_ref(mm)
> > {
> >    int ret;
> >
> >    if (mm == &init_mm)
> >         return 0;
> >
> >    mmget(mm);
> >    ret = access_remote_vm(mm);
> >    mmput(mm);
> >
> >    return ret;
> > }
> > EXPORT_SYMBOL_GPL(access_remote_vm_ref);
> >
> > IDXD or any driver calls this and holds mm reference count while accesses the
> mm.
> > This is useful for caller to directly access mm even if mm's owner could be
> NULL.
> 
> I'm not sure that helps much. A driver would still need to hold a mm_count to
> ensure the struct_mm itself can't go away anyway so it may as well do the
> mmget() IMHO (although it really should be mmget_not_zero()).
> 
> In any case though iommu_sva_find() already takes care of doing
> mmget_not_zero().

That's right. IDXD driver calls iommu_sva_find() which holds mm reference before
access_remote_vm(mm) and puts the count after.

And comment of access_remote_vm() explicitly says "The caller must hold a reference on @mm.".
IDXD follows the mm reference policy. There is no need to have a different wrapper.

So the current patch is good without any change, right?

> I wonder if it makes more sense to define a wrapper (eg.
> iommu_access_pasid) that takes a PASID and does the mm
> lookup/access_vm/mmput?

Currently access_remove_vm() is called only once in IDXD. And the calling code
is clearly to have mmget() and mmput() already. The proposed
wrapper iommu_access_pasid() may not be very useful. We may add the wrapper
in the future if there are more usages. Is that OK?

Thanks.

-Fenghua
Lorenzo Stoakes Jan. 4, 2023, 7:56 p.m. UTC | #8
On Wed, Jan 04, 2023 at 05:12:31PM +1100, Alistair Popple wrote:
> Obviously something must still be holding a mmgrab() though. That should
> happen as part of the PASID allocation done by iommu_sva_bind_device().

I'm not familiar with the iommu code, but a brief glance suggests that no
mmgrab() is being performed for intel devices? I may have missed something
though.

We do need to be absolutely sure the mm sticks around (hence the
grab) but if we need userland mappings that we have to have a subsequent
mmget_not_zero().

> >> I definitely don't feel as if simply exporting this is a safe option, and you would in
> >> that case need a new function that handles different scenarios of mm
> >> ownership/not.
>
> Note this isn't that different from get_user_pages_remote().

get_user_pages_remote() differs in that, most importantly, it requires the
mm_lock is held on invocation (implying that the mm will stick around), which is
not the case for access_remote_vm() (as __access_remote_vm() subsequently
obtains it), but also in that it pins pages but doesn't copy things to a buffer
(rather returning VMAs or struct page objects).

Also note the comment around get_user_pages_remote() saying nobody should be
using it due to lack of FAULT_FLAG_ALLOW_RETRY handling :) yes it feels like GUP
is a bit of a mess.

> In any case though iommu_sva_find() already takes care of doing
> mmget_not_zero(). I wonder if it makes more sense to define a wrapper
> (eg. iommu_access_pasid) that takes a PASID and does the mm
> lookup/access_vm/mmput?

My concern is exposing something highly delicate _which accesses remote mas a public API with implicit
assumptions whose one and only (core kernel) user treats with enormous
caution. Even if this iommu code were to use it correctly, we'd end up with an
interface which could be subject to real risks which other drivers may misuse.

Another question is - why can't we:-

1. mmgrab() [or safely assume we already have a reference] + mmget_not_zero()
2. acquire mm read lock to stop VMAs disappearing underneath us and pin pages with get_user_pages_remote()
3. copy what we need using e.g. copy_from_user()/copy_to_user()
4. unwind
Lorenzo Stoakes Jan. 4, 2023, 8 p.m. UTC | #9
On Wed, Jan 04, 2023 at 07:00:55PM +0000, Yu, Fenghua wrote:
> So the current patch is good without any change, right?

No, you'd definitely need mmget_not_zero() (and a grab in advance of this,
though from the sounds of it you may already have it), I definitely don't think
the patch as-is works, see my reply to Alistair.
Lorenzo Stoakes Jan. 4, 2023, 9:05 p.m. UTC | #10
On Wed, Jan 04, 2023 at 07:56:35PM +0000, Lorenzo Stoakes wrote:
> Another question is - why can't we:-
>
> 1. mmgrab() [or safely assume we already have a reference] + mmget_not_zero()
> 2. acquire mm read lock to stop VMAs disappearing underneath us and pin pages with get_user_pages_remote()
> 3. copy what we need using e.g. copy_from_user()/copy_to_user()
> 4. unwind

OK looking at __access_remote_vm() I just accidentally described exactly what it
does other than step 1 :)

Perhaps then the answer is a wrapper that gets the reference before
invoking __access_remote_vm()? I guess we could assume grab there.

It strikes me that access_remote_vm() being quite literally a pass through to
__access_remote_vm() means we could:-

a. change all callers of access_remote_vm() to use __access_remote_vm()
b. Update access_remote_vm() to be safer
c. finally, export access_remote_vm()

e.g.:-

int access_remote_vm(struct mm_struct *mm, unsigned long addr,
		void *buf, int len, unsigned int gup_flags)
{
	int ret;

	if (!mmget_not_zero(mm))
		return 0;

	ret = __access_remote_vm(mm, addr, buf, len, gup_flags);

	mmput(mm);
}
EXPORT_SYMBOL_GPL(access_remote_vm)
Alistair Popple Jan. 4, 2023, 11:57 p.m. UTC | #11
Lorenzo Stoakes <lstoakes@gmail.com> writes:

> On Wed, Jan 04, 2023 at 05:12:31PM +1100, Alistair Popple wrote:
>> Obviously something must still be holding a mmgrab() though. That should
>> happen as part of the PASID allocation done by iommu_sva_bind_device().
>
> I'm not familiar with the iommu code, but a brief glance suggests that no
> mmgrab() is being performed for intel devices? I may have missed something
> though.

I'm more familiar with the ARM side of things, but we can safely assume
we always have a valid mmgrab()/mm_count while the PASID is bound because
iommu_sva_bind_device() -> iommu_sva_domain_alloc() -> mmgrab().

> We do need to be absolutely sure the mm sticks around (hence the
> grab) but if we need userland mappings that we have to have a subsequent
> mmget_not_zero().

Yeah, iommu_sva_find() does take care of that though:

 * On success a reference to the mm is taken, and must be released with mmput().

>> >> I definitely don't feel as if simply exporting this is a safe option, and you would in
>> >> that case need a new function that handles different scenarios of mm
>> >> ownership/not.
>>
>> Note this isn't that different from get_user_pages_remote().
>
> get_user_pages_remote() differs in that, most importantly, it requires the
> mm_lock is held on invocation (implying that the mm will stick around), which is
> not the case for access_remote_vm() (as __access_remote_vm() subsequently
> obtains it), but also in that it pins pages but doesn't copy things to a buffer
> (rather returning VMAs or struct page objects).

Oh that makes sense.

> Also note the comment around get_user_pages_remote() saying nobody should be
> using it due to lack of FAULT_FLAG_ALLOW_RETRY handling :) yes it feels like GUP
> is a bit of a mess.
>
>> In any case though iommu_sva_find() already takes care of doing
>> mmget_not_zero(). I wonder if it makes more sense to define a wrapper
>> (eg. iommu_access_pasid) that takes a PASID and does the mm
>> lookup/access_vm/mmput?
>
> My concern is exposing something highly delicate _which accesses remote mas a public API with implicit
> assumptions whose one and only (core kernel) user treats with enormous
> caution. Even if this iommu code were to use it correctly, we'd end up with an
> interface which could be subject to real risks which other drivers may misuse.

Ok, although I think making this an iommu specific wrapper taking a
PASID rather than mm_struct would make the API more specific and less
likely to be misused as the mm_count/users lifetime issues could be
dealt with inside the core IOMMU code.

> Another question is - why can't we:-
>
> 1. mmgrab() [or safely assume we already have a reference] + mmget_not_zero()
> 2. acquire mm read lock to stop VMAs disappearing underneath us and pin pages with get_user_pages_remote()
> 3. copy what we need using e.g. copy_from_user()/copy_to_user()
> 4. unwind

Seems reasonable to me at least, but I don't have any strong opinions as
I only noticed this thread while trying to catch up on IOMMU
developments.
Fenghua Yu Jan. 5, 2023, 3:08 a.m. UTC | #12
Hi, Alistair,

> Alistair Popple <apopple@nvidia.com> writes:
> Lorenzo Stoakes <lstoakes@gmail.com> writes:
> > My concern is exposing something highly delicate _which accesses
> > remote mas a public API with implicit assumptions whose one and only
> > (core kernel) user treats with enormous caution. Even if this iommu
> > code were to use it correctly, we'd end up with an interface which could be
> subject to real risks which other drivers may misuse.
> 
> Ok, although I think making this an iommu specific wrapper taking a PASID
> rather than mm_struct would make the API more specific and less likely to be
> misused as the mm_count/users lifetime issues could be dealt with inside the
> core IOMMU code.

The iommu specific wrapper still needs to call access_remote_vm() which is
in generic mm. We cannot avoid to export access_remote_vm(), right?
Are you saying the iommu specific wrapper doesn't need to the mm code?
 
Thanks.

-Fenghua
Alistair Popple Jan. 5, 2023, 3:22 a.m. UTC | #13
"Yu, Fenghua" <fenghua.yu@intel.com> writes:

> Hi, Alistair,
>> Alistair Popple <apopple@nvidia.com> writes:
>> Lorenzo Stoakes <lstoakes@gmail.com> writes:
>> > My concern is exposing something highly delicate _which accesses
>> > remote mas a public API with implicit assumptions whose one and only
>> > (core kernel) user treats with enormous caution. Even if this iommu
>> > code were to use it correctly, we'd end up with an interface which could be
>> subject to real risks which other drivers may misuse.
>> 
>> Ok, although I think making this an iommu specific wrapper taking a PASID
>> rather than mm_struct would make the API more specific and less likely to be
>> misused as the mm_count/users lifetime issues could be dealt with inside the
>> core IOMMU code.
>
> The iommu specific wrapper still needs to call access_remote_vm() which is
> in generic mm. We cannot avoid to export access_remote_vm(), right?

The wrapper still needs to call access_remote_vm(), but that doesn't
imply access_remote_vm() needs to be exported. I think the logical place
to put the wrapper would be in iommu-sva.c which isn't built as a
module, so you would only have to EXPORT_SYMBOL_GPL the wrapper and not
access_remote_vm().

> Are you saying the iommu specific wrapper doesn't need to the mm code?
>
> Thanks.
>
> -Fenghua
Lorenzo Stoakes Jan. 5, 2023, 7:26 a.m. UTC | #14
On Thu, Jan 05, 2023 at 10:57:02AM +1100, Alistair Popple wrote:
> Ok, although I think making this an iommu specific wrapper taking a
> PASID rather than mm_struct would make the API more specific and less
> likely to be misused as the mm_count/users lifetime issues could be
> dealt with inside the core IOMMU code.

This sounds like the ideal approach, as long as we either mmgrab() or otherwise
ensure that the mm won't disappear underneath us, then use mmget_not_zero() to
get a reference in this wrapper function, I think we're gravy.
Fenghua Yu Jan. 5, 2023, 8:58 p.m. UTC | #15
Hi, Alistair,

> >> Alistair Popple <apopple@nvidia.com> writes:
> >> Lorenzo Stoakes <lstoakes@gmail.com> writes:
> >> > My concern is exposing something highly delicate _which accesses
> >> > remote mas a public API with implicit assumptions whose one and
> >> > only (core kernel) user treats with enormous caution. Even if this
> >> > iommu code were to use it correctly, we'd end up with an interface
> >> > which could be
> >> subject to real risks which other drivers may misuse.
> >>
> >> Ok, although I think making this an iommu specific wrapper taking a
> >> PASID rather than mm_struct would make the API more specific and less
> >> likely to be misused as the mm_count/users lifetime issues could be
> >> dealt with inside the core IOMMU code.
> >
> > The iommu specific wrapper still needs to call access_remote_vm()
> > which is in generic mm. We cannot avoid to export access_remote_vm(), right?
> 
> The wrapper still needs to call access_remote_vm(), but that doesn't imply
> access_remote_vm() needs to be exported. I think the logical place to put the
> wrapper would be in iommu-sva.c which isn't built as a module, so you would
> only have to EXPORT_SYMBOL_GPL the wrapper and not access_remote_vm().

This looks better than exporting access_remote_vm(). I will remove this patch
and write the IOMMU wrapper to call access_remote_vm() in v2.

Thanks.

-Fenghua
Lorenzo Stoakes Jan. 5, 2023, 9:04 p.m. UTC | #16
On Thu, Jan 05, 2023 at 08:58:08PM +0000, Yu, Fenghua wrote:
> This looks better than exporting access_remote_vm(). I will remove this patch
> and write the IOMMU wrapper to call access_remote_vm() in v2.

Thank you both for a very constructive conversation! I definitely think this is
the right way to go.

All the best, Lorenzo
Christoph Hellwig Jan. 8, 2023, 5:36 p.m. UTC | #17
Exporting access_remote_vm just seems like a horrible idea.

If a driver needs to access a different VM from a completion path
in some thread or workqueue (which I assume this does, if not please
explain the use case), it should use kthread_use_mm to associate the
mm struct and then just use all the normal uaccess helpers.
Fenghua Yu March 1, 2023, 11:39 p.m. UTC | #18
Hi, Christoph,

On 1/8/23 09:36, Christoph Hellwig wrote:
> Exporting access_remote_vm just seems like a horrible idea.
> 
> If a driver needs to access a different VM from a completion path
> in some thread or workqueue (which I assume this does, if not please
> explain the use case), it should use kthread_use_mm to associate the
> mm struct and then just use all the normal uaccess helpers.

To trigger fault by IDXD device, user app frees the fault_addr pages.
After kthread_use_mm(), the IDXD driver cannot directly call
copy_to_user() to copy data to fault_addr because
the pages are freed by the app. If the IDXD driver
tries to copy to the app's fault_addr, it needs to get all the pages,
kmap, copy_to_user_page() etc, basically majority of the 
__access_remote_vm() code. Without exporting access_remote_vm(),
the driver has to re-implement the function for the copy to work
in IDXD driver.

Maybe a simple exported IOMMU wrapper in iommu-sva.c can help here?
CONFIG_IOMMU_SVA is bool and needs to be set for Event Log to work.
So the wrapper is exported and can be called the IDXD driver or any
driver that is on top of IOMMU SVA.

No need to export access_remote_vm() any more.

Plus with this wrapper, no need to export iommu_sva_find()
(in an earlier patch). So the code will be more clean and concise.

Is this OK for you?

struct mm_struct
*iommu_access_remote_vm(ioasid_t pasid, unsigned long addr,
			void *buf, int len, unsigned int gup_flags, int *copied)
{
	struct mm_struct *mm;

	mm = iommu_sva_find(pasid);
	if (!IS_ERR_OR_NULL(mm)) {
		/* A reference on @mm has been held. */
		*copied = access_remote_vm(mm, addr, buf, len, gup_flags);
	}

	return mm;
}
EXPORT_SYMBOL_GPL(iommu_access_remote_vm);

Thanks.

-Fenghua
diff mbox series

Patch

diff --git a/mm/memory.c b/mm/memory.c
index aad226daf41b..caae4deff987 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5579,6 +5579,7 @@  int access_remote_vm(struct mm_struct *mm, unsigned long addr,
 {
 	return __access_remote_vm(mm, addr, buf, len, gup_flags);
 }
+EXPORT_SYMBOL_GPL(access_remote_vm);
 
 /*
  * Access another process' address space.