diff mbox series

[RFC,v2,1/7] IB/mlx5: Change ib_umem_odp_map_dma_single_page() to retain umem_mutex

Message ID b9974985069900c80b8ff9e6b0b0b346c1592910.1668157436.git.matsuda-daisuke@fujitsu.com (mailing list archive)
State Superseded
Headers show
Series On-Demand Paging on SoftRoCE | expand

Commit Message

Daisuke Matsuda (Fujitsu) Nov. 11, 2022, 9:22 a.m. UTC
ib_umem_odp_map_dma_single_page(), which has been used only by the mlx5
driver, holds umem_mutex on success and releases on failure. This
behavior is not convenient for other drivers to use it, so change it to
always retain mutex on return.

Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
---
 drivers/infiniband/core/umem_odp.c | 8 +++-----
 drivers/infiniband/hw/mlx5/odp.c   | 4 +++-
 2 files changed, 6 insertions(+), 6 deletions(-)

Comments

lizhijian@fujitsu.com Nov. 17, 2022, 1:20 p.m. UTC | #1
On 11/11/2022 17:22, Daisuke Matsuda wrote:
> ib_umem_odp_map_dma_single_page(), which has been used only by the mlx5
> driver, holds umem_mutex on success and releases on failure. This
> behavior is not convenient for other drivers to use it, so change it to
> always retain mutex on return.
> 
> Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
> ---
>   drivers/infiniband/core/umem_odp.c | 8 +++-----
>   drivers/infiniband/hw/mlx5/odp.c   | 4 +++-
>   2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
> index e9fa22d31c23..49da6735f7c8 100644
> --- a/drivers/infiniband/core/umem_odp.c
> +++ b/drivers/infiniband/core/umem_odp.c
> @@ -328,8 +328,8 @@ static int ib_umem_odp_map_dma_single_page(
>    *
>    * Maps the range passed in the argument to DMA addresses.
>    * The DMA addresses of the mapped pages is updated in umem_odp->dma_list.
> - * Upon success the ODP MR will be locked to let caller complete its device
> - * page table update.
> + * The umem mutex is locked in this function. Callers are responsible for
> + * releasing the lock.
>    *


>    * Returns the number of pages mapped in success, negative error code
>    * for failure.
> @@ -453,11 +453,9 @@ int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt,
>   			break;
>   		}
>   	}
> -	/* upon success lock should stay on hold for the callee */
> +
>   	if (!ret)
>   		ret = dma_index - start_idx;
> -	else
> -		mutex_unlock(&umem_odp->umem_mutex);
>   
>   out_put_mm:
>   	mmput_async(owning_mm);
> diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
> index bc97958818bb..a0de27651586 100644
> --- a/drivers/infiniband/hw/mlx5/odp.c
> +++ b/drivers/infiniband/hw/mlx5/odp.c
> @@ -572,8 +572,10 @@ static int pagefault_real_mr(struct mlx5_ib_mr *mr, struct ib_umem_odp *odp,
>   		access_mask |= ODP_WRITE_ALLOWED_BIT;
>   
>   	np = ib_umem_odp_map_dma_and_lock(odp, user_va, bcnt, access_mask, fault);
> -	if (np < 0)
> +	if (np < 0) {
> +		mutex_unlock(&odp->umem_mutex);
>   		return np;
> +	}

refer to the comments of ib_umem_odp_map_dma_and_lock:
334  * Returns the number of pages mapped in success, negative error 
code
335  * for failure.

I don't think it's correct to release the lock in all failure case, for 
example when it reaches below error path.

346 int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 
user_virt,
347                                  u64 bcnt, u64 access_mask, bool 
fault)
348                         __acquires(&umem_odp->umem_mutex) 

349 { 

350         struct task_struct *owning_process  = NULL; 

351         struct mm_struct *owning_mm = umem_odp->umem.owning_mm; 

352         int pfn_index, dma_index, ret = 0, start_idx; 

353         unsigned int page_shift, hmm_order, pfn_start_idx; 

354         unsigned long num_pfns, current_seq; 

355         struct hmm_range range = {}; 

356         unsigned long timeout; 

357 

358         if (access_mask == 0) 

359                 return -EINVAL;   <<<<<   no lock is hold yet 

360 

361         if (user_virt < ib_umem_start(umem_odp) || 

362             user_virt + bcnt > ib_umem_end(umem_odp)) 

363                 return -EFAULT;   <<<<<   no lock is hold yet


Further more, you changed public API's the behavior, do it matter for 
other out-of-tree drivers which is using it, i'm not familiar with this, 
maybe kernel has no restriction on it ?


>   
>   	/*
>   	 * No need to check whether the MTTs really belong to this MR, since
Daisuke Matsuda (Fujitsu) Nov. 18, 2022, 5:48 a.m. UTC | #2
On Thu, Nov 17, 2022 10:21 PM Li, Zhijian wrote:
> On 11/11/2022 17:22, Daisuke Matsuda wrote:
> > ib_umem_odp_map_dma_single_page(), which has been used only by the mlx5
> > driver, holds umem_mutex on success and releases on failure. This
> > behavior is not convenient for other drivers to use it, so change it to
> > always retain mutex on return.
> >
> > Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
> > ---
> >   drivers/infiniband/core/umem_odp.c | 8 +++-----
> >   drivers/infiniband/hw/mlx5/odp.c   | 4 +++-
> >   2 files changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
> > index e9fa22d31c23..49da6735f7c8 100644
> > --- a/drivers/infiniband/core/umem_odp.c
> > +++ b/drivers/infiniband/core/umem_odp.c
> > @@ -328,8 +328,8 @@ static int ib_umem_odp_map_dma_single_page(
> >    *
> >    * Maps the range passed in the argument to DMA addresses.
> >    * The DMA addresses of the mapped pages is updated in umem_odp->dma_list.
> > - * Upon success the ODP MR will be locked to let caller complete its device
> > - * page table update.
> > + * The umem mutex is locked in this function. Callers are responsible for
> > + * releasing the lock.
> >    *
> 
> 
> >    * Returns the number of pages mapped in success, negative error code
> >    * for failure.
> > @@ -453,11 +453,9 @@ int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt,
> >   			break;
> >   		}
> >   	}
> > -	/* upon success lock should stay on hold for the callee */
> > +
> >   	if (!ret)
> >   		ret = dma_index - start_idx;
> > -	else
> > -		mutex_unlock(&umem_odp->umem_mutex);
> >
> >   out_put_mm:
> >   	mmput_async(owning_mm);
> > diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
> > index bc97958818bb..a0de27651586 100644
> > --- a/drivers/infiniband/hw/mlx5/odp.c
> > +++ b/drivers/infiniband/hw/mlx5/odp.c
> > @@ -572,8 +572,10 @@ static int pagefault_real_mr(struct mlx5_ib_mr *mr, struct ib_umem_odp *odp,
> >   		access_mask |= ODP_WRITE_ALLOWED_BIT;
> >
> >   	np = ib_umem_odp_map_dma_and_lock(odp, user_va, bcnt, access_mask, fault);
> > -	if (np < 0)
> > +	if (np < 0) {
> > +		mutex_unlock(&odp->umem_mutex);
> >   		return np;
> > +	}
> 
> refer to the comments of ib_umem_odp_map_dma_and_lock:
> 334  * Returns the number of pages mapped in success, negative error
> code
> 335  * for failure.
> 
> I don't think it's correct to release the lock in all failure case, for
> example when it reaches below error path.

Thank you. That's certainly true. I will fix this in v3.
Probably I will drop this patch and make changes on rxe side instead.

> 
> 346 int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64
> user_virt,
> 347                                  u64 bcnt, u64 access_mask, bool
> fault)
> 348                         __acquires(&umem_odp->umem_mutex)
> 
> 349 {
> 
> 350         struct task_struct *owning_process  = NULL;
> 
> 351         struct mm_struct *owning_mm = umem_odp->umem.owning_mm;
> 
> 352         int pfn_index, dma_index, ret = 0, start_idx;
> 
> 353         unsigned int page_shift, hmm_order, pfn_start_idx;
> 
> 354         unsigned long num_pfns, current_seq;
> 
> 355         struct hmm_range range = {};
> 
> 356         unsigned long timeout;
> 
> 357
> 
> 358         if (access_mask == 0)
> 
> 359                 return -EINVAL;   <<<<<   no lock is hold yet
> 
> 360
> 
> 361         if (user_virt < ib_umem_start(umem_odp) ||
> 
> 362             user_virt + bcnt > ib_umem_end(umem_odp))
> 
> 363                 return -EFAULT;   <<<<<   no lock is hold yet
> 
> 
> Further more, you changed public API's the behavior, do it matter for
> other out-of-tree drivers which is using it, i'm not familiar with this,
> maybe kernel has no restriction on it ?

Yes, they are just left behind. The developers have to modify the driver
by themselves if they want to keep it compatible with the upstream.
I think this is one of the general reasons why companies contribute
their works to OSS communities. They can lower the cost of maintaining
their software while getting benefits from new features.

Cf. The Linux Kernel Driver Interface
https://www.kernel.org/doc/html/latest/process/stable-api-nonsense.html

Thanks,
Daisuke

> 
> 
> >
> >   	/*
> >   	 * No need to check whether the MTTs really belong to this MR, since
diff mbox series

Patch

diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index e9fa22d31c23..49da6735f7c8 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -328,8 +328,8 @@  static int ib_umem_odp_map_dma_single_page(
  *
  * Maps the range passed in the argument to DMA addresses.
  * The DMA addresses of the mapped pages is updated in umem_odp->dma_list.
- * Upon success the ODP MR will be locked to let caller complete its device
- * page table update.
+ * The umem mutex is locked in this function. Callers are responsible for
+ * releasing the lock.
  *
  * Returns the number of pages mapped in success, negative error code
  * for failure.
@@ -453,11 +453,9 @@  int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt,
 			break;
 		}
 	}
-	/* upon success lock should stay on hold for the callee */
+
 	if (!ret)
 		ret = dma_index - start_idx;
-	else
-		mutex_unlock(&umem_odp->umem_mutex);
 
 out_put_mm:
 	mmput_async(owning_mm);
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index bc97958818bb..a0de27651586 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -572,8 +572,10 @@  static int pagefault_real_mr(struct mlx5_ib_mr *mr, struct ib_umem_odp *odp,
 		access_mask |= ODP_WRITE_ALLOWED_BIT;
 
 	np = ib_umem_odp_map_dma_and_lock(odp, user_va, bcnt, access_mask, fault);
-	if (np < 0)
+	if (np < 0) {
+		mutex_unlock(&odp->umem_mutex);
 		return np;
+	}
 
 	/*
 	 * No need to check whether the MTTs really belong to this MR, since