[v11,rdma-next,5/7] RDMA/qedr: Use the common mmap API
diff mbox series

Message ID 20190905100117.20879-6-michal.kalderon@marvell.com
State Superseded
Delegated to: Jason Gunthorpe
Headers show
Series
  • RDMA/qedr: Use the doorbell overflow recovery mechanism for RDMA
Related show

Commit Message

Michal Kalderon Sept. 5, 2019, 10:01 a.m. UTC
Remove all function related to mmap from qedr and use the common
API

Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
---
 drivers/infiniband/hw/qedr/main.c  |   1 +
 drivers/infiniband/hw/qedr/qedr.h  |  28 +++---
 drivers/infiniband/hw/qedr/verbs.c | 196 ++++++++++++++++++-------------------
 drivers/infiniband/hw/qedr/verbs.h |   3 +-
 4 files changed, 111 insertions(+), 117 deletions(-)

Comments

Jason Gunthorpe Sept. 19, 2019, 5:55 p.m. UTC | #1
On Thu, Sep 05, 2019 at 01:01:15PM +0300, Michal Kalderon wrote:

> -int qedr_mmap(struct ib_ucontext *context, struct vm_area_struct *vma)
> +void qedr_mmap_free(struct rdma_user_mmap_entry *rdma_entry)
>  {
> -	struct qedr_ucontext *ucontext = get_qedr_ucontext(context);
> -	struct qedr_dev *dev = get_qedr_dev(context->device);
> -	unsigned long phys_addr = vma->vm_pgoff << PAGE_SHIFT;
> -	unsigned long len = (vma->vm_end - vma->vm_start);
> -	unsigned long dpi_start;
> +	struct qedr_user_mmap_entry *entry = get_qedr_mmap_entry(rdma_entry);
>  
> -	dpi_start = dev->db_phys_addr + (ucontext->dpi * ucontext->dpi_size);
> -
> -	DP_DEBUG(dev, QEDR_MSG_INIT,
> -		 "mmap invoked with vm_start=0x%pK, vm_end=0x%pK,vm_pgoff=0x%pK; dpi_start=0x%pK dpi_size=0x%x\n",
> -		 (void *)vma->vm_start, (void *)vma->vm_end,
> -		 (void *)vma->vm_pgoff, (void *)dpi_start, ucontext->dpi_size);
> +	kfree(entry);
> +}

Huh. If you recall we did all this work with the XA and the free
callback because you said qedr was mmaping BAR pages that had some HW
lifetime associated with them, and the HW resource was not to be
reallocated until all users were gone.

I think it would be a better example of this API if you pulled the

 	dev->ops->rdma_remove_user(dev->rdma_ctx, ctx->dpi);

Into qedr_mmap_free().

Then the rdma_user_mmap_entry_remove() will call it naturally as it
does entry_put() and if we are destroying the ucontext we already know
the mmaps are destroyed.

Maybe the same basic comment for EFA, not sure. Gal?

Jason
Gal Pressman Sept. 20, 2019, 1:39 p.m. UTC | #2
On 19/09/2019 20:55, Jason Gunthorpe wrote:
> Huh. If you recall we did all this work with the XA and the free
> callback because you said qedr was mmaping BAR pages that had some HW
> lifetime associated with them, and the HW resource was not to be
> reallocated until all users were gone.
> 
> I think it would be a better example of this API if you pulled the
> 
>  	dev->ops->rdma_remove_user(dev->rdma_ctx, ctx->dpi);
> 
> Into qedr_mmap_free().
> 
> Then the rdma_user_mmap_entry_remove() will call it naturally as it
> does entry_put() and if we are destroying the ucontext we already know
> the mmaps are destroyed.
> 
> Maybe the same basic comment for EFA, not sure. Gal?

That's what EFA already does in this series, no?
We no longer remove entries on dealloc_ucontext, only when the entry is freed.
Michal Kalderon Sept. 23, 2019, 9:21 a.m. UTC | #3
> From: linux-rdma-owner@vger.kernel.org <linux-rdma-
> owner@vger.kernel.org> On Behalf Of Gal Pressman
> 
> On 19/09/2019 20:55, Jason Gunthorpe wrote:
> > Huh. If you recall we did all this work with the XA and the free
> > callback because you said qedr was mmaping BAR pages that had some HW
> > lifetime associated with them, and the HW resource was not to be
> > reallocated until all users were gone.
> >
> > I think it would be a better example of this API if you pulled the
> >
> >  	dev->ops->rdma_remove_user(dev->rdma_ctx, ctx->dpi);
> >
> > Into qedr_mmap_free().
> >
> > Then the rdma_user_mmap_entry_remove() will call it naturally as it
> > does entry_put() and if we are destroying the ucontext we already know
> > the mmaps are destroyed.
> >
> > Maybe the same basic comment for EFA, not sure. Gal?
> 
> That's what EFA already does in this series, no?
> We no longer remove entries on dealloc_ucontext, only when the entry is
> freed.

Actually, I think most of the discussions you had on the topic were with Gal, but
Some apply to qedr as well, however, for qedr, the only hw resource we allocate (bar)
is on alloc_ucontext , therefore we were safe to free it on dealloc_ucontext as all mappings
were already zapped. Making the mmap_free a bit redundant for qedr except for the need
to free the entry. 

For EFA, it seemed the only operation delayed was freeing memory - I didn't see hw resources
being freed... Gal?
Jason Gunthorpe Sept. 23, 2019, 1:29 p.m. UTC | #4
On Mon, Sep 23, 2019 at 09:21:37AM +0000, Michal Kalderon wrote:
> > From: linux-rdma-owner@vger.kernel.org <linux-rdma-
> > owner@vger.kernel.org> On Behalf Of Gal Pressman
> > 
> > On 19/09/2019 20:55, Jason Gunthorpe wrote:
> > > Huh. If you recall we did all this work with the XA and the free
> > > callback because you said qedr was mmaping BAR pages that had some HW
> > > lifetime associated with them, and the HW resource was not to be
> > > reallocated until all users were gone.
> > >
> > > I think it would be a better example of this API if you pulled the
> > >
> > >  	dev->ops->rdma_remove_user(dev->rdma_ctx, ctx->dpi);
> > >
> > > Into qedr_mmap_free().
> > >
> > > Then the rdma_user_mmap_entry_remove() will call it naturally as it
> > > does entry_put() and if we are destroying the ucontext we already know
> > > the mmaps are destroyed.
> > >
> > > Maybe the same basic comment for EFA, not sure. Gal?
> > 
> > That's what EFA already does in this series, no?
> > We no longer remove entries on dealloc_ucontext, only when the entry is
> > freed.
> 
> Actually, I think most of the discussions you had on the topic were with Gal, but
> Some apply to qedr as well, however, for qedr, the only hw resource we allocate (bar)
> is on alloc_ucontext , therefore we were safe to free it on dealloc_ucontext as all mappings
> were already zapped. Making the mmap_free a bit redundant for qedr except for the need
> to free the entry. 

I think if we are changing things to use this new interface then you
should consistenly code so that the lifetime of the object in the
mmap_entry is entirely controlled by the mmap_entry and not also split
to the ucontext.

Having mmapable object lifetimes linked to the ucontext is a bit of a
hack really

Jason
Michal Kalderon Sept. 24, 2019, 8:25 a.m. UTC | #5
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Monday, September 23, 2019 4:30 PM
> 
> External Email
> 
> ----------------------------------------------------------------------
> On Mon, Sep 23, 2019 at 09:21:37AM +0000, Michal Kalderon wrote:
> > > From: linux-rdma-owner@vger.kernel.org <linux-rdma-
> > > owner@vger.kernel.org> On Behalf Of Gal Pressman
> > >
> > > On 19/09/2019 20:55, Jason Gunthorpe wrote:
> > > > Huh. If you recall we did all this work with the XA and the free
> > > > callback because you said qedr was mmaping BAR pages that had some
> > > > HW lifetime associated with them, and the HW resource was not to
> > > > be reallocated until all users were gone.
> > > >
> > > > I think it would be a better example of this API if you pulled the
> > > >
> > > >  	dev->ops->rdma_remove_user(dev->rdma_ctx, ctx->dpi);
> > > >
> > > > Into qedr_mmap_free().
> > > >
> > > > Then the rdma_user_mmap_entry_remove() will call it naturally as
> > > > it does entry_put() and if we are destroying the ucontext we
> > > > already know the mmaps are destroyed.
> > > >
> > > > Maybe the same basic comment for EFA, not sure. Gal?
> > >
> > > That's what EFA already does in this series, no?
> > > We no longer remove entries on dealloc_ucontext, only when the entry
> > > is freed.
> >
> > Actually, I think most of the discussions you had on the topic were
> > with Gal, but Some apply to qedr as well, however, for qedr, the only
> > hw resource we allocate (bar) is on alloc_ucontext , therefore we were
> > safe to free it on dealloc_ucontext as all mappings were already
> > zapped. Making the mmap_free a bit redundant for qedr except for the
> need to free the entry.
> 
> I think if we are changing things to use this new interface then you should
> consistenly code so that the lifetime of the object in the mmap_entry is
> entirely controlled by the mmap_entry and not also split to the ucontext.
> 
> Having mmapable object lifetimes linked to the ucontext is a bit of a hack
> really
> 
Ok, but the ucontext "hack" will need to remain for other drivers. 

> Jason
Gal Pressman Sept. 24, 2019, 8:49 a.m. UTC | #6
> On 23 Sep 2019, at 18:22, Michal Kalderon <mkalderon@marvell.com> wrote:
> 
> 
>> 
>> From: linux-rdma-owner@vger.kernel.org <linux-rdma-
>> owner@vger.kernel.org> On Behalf Of Gal Pressman
>> 
>>> On 19/09/2019 20:55, Jason Gunthorpe wrote:
>>> Huh. If you recall we did all this work with the XA and the free
>>> callback because you said qedr was mmaping BAR pages that had some HW
>>> lifetime associated with them, and the HW resource was not to be
>>> reallocated until all users were gone.
>>> 
>>> I think it would be a better example of this API if you pulled the
>>> 
>>>    dev->ops->rdma_remove_user(dev->rdma_ctx, ctx->dpi);
>>> 
>>> Into qedr_mmap_free().
>>> 
>>> Then the rdma_user_mmap_entry_remove() will call it naturally as it
>>> does entry_put() and if we are destroying the ucontext we already know
>>> the mmaps are destroyed.
>>> 
>>> Maybe the same basic comment for EFA, not sure. Gal?
>> 
>> That's what EFA already does in this series, no?
>> We no longer remove entries on dealloc_ucontext, only when the entry is
>> freed.
> 
> Actually, I think most of the discussions you had on the topic were with Gal, but
> Some apply to qedr as well, however, for qedr, the only hw resource we allocate (bar)
> is on alloc_ucontext , therefore we were safe to free it on dealloc_ucontext as all mappings
> were already zapped. Making the mmap_free a bit redundant for qedr except for the need
> to free the entry. 
> 
> For EFA, it seemed the only operation delayed was freeing memory - I didn't see hw resources
> being freed... Gal?

What do you mean by hw resources being freed? The BAR mappings are under the device’s control and are associated to the lifetime of the UAR.
Michal Kalderon Sept. 24, 2019, 9:31 a.m. UTC | #7
> From: Pressman, Gal <galpress@amazon.com>
> Sent: Tuesday, September 24, 2019 11:50 AM
> 
> 
> > On 23 Sep 2019, at 18:22, Michal Kalderon <mkalderon@marvell.com>
> wrote:
> >
> > 
> >>
> >> From: linux-rdma-owner@vger.kernel.org <linux-rdma-
> >> owner@vger.kernel.org> On Behalf Of Gal Pressman
> >>
> >>> On 19/09/2019 20:55, Jason Gunthorpe wrote:
> >>> Huh. If you recall we did all this work with the XA and the free
> >>> callback because you said qedr was mmaping BAR pages that had some
> >>> HW lifetime associated with them, and the HW resource was not to be
> >>> reallocated until all users were gone.
> >>>
> >>> I think it would be a better example of this API if you pulled the
> >>>
> >>>    dev->ops->rdma_remove_user(dev->rdma_ctx, ctx->dpi);
> >>>
> >>> Into qedr_mmap_free().
> >>>
> >>> Then the rdma_user_mmap_entry_remove() will call it naturally as it
> >>> does entry_put() and if we are destroying the ucontext we already
> >>> know the mmaps are destroyed.
> >>>
> >>> Maybe the same basic comment for EFA, not sure. Gal?
> >>
> >> That's what EFA already does in this series, no?
> >> We no longer remove entries on dealloc_ucontext, only when the entry
> >> is freed.
> >
> > Actually, I think most of the discussions you had on the topic were
> > with Gal, but Some apply to qedr as well, however, for qedr, the only
> > hw resource we allocate (bar) is on alloc_ucontext , therefore we were
> > safe to free it on dealloc_ucontext as all mappings were already
> > zapped. Making the mmap_free a bit redundant for qedr except for the
> need to free the entry.
> >
> > For EFA, it seemed the only operation delayed was freeing memory - I
> > didn't see hw resources being freed... Gal?
> 
> What do you mean by hw resources being freed? The BAR mappings are
> under the device’s control and are associated to the lifetime of the UAR.
The bar offset you get is from the device -> don't you release it back to the device
So it can be allocated to a different application ? 
In efa_com_create_qp -> you get offsets from the device that you use for mapping
The bar -> are these unique for every call ? are they released during destroy_qp ? 
Before this patch series mmap_entries_remove_free only freed the DMA pages, but
Following this thread, it seemed the initial intention was that only the hw resources would
Be delayed as the DMA pages are ref counted anyway.  I didn't see any delay to returning
The bar offsets to the device. Thanks.
Gal Pressman Oct. 20, 2019, 7:19 a.m. UTC | #8
On 24/09/2019 12:31, Michal Kalderon wrote:
>> From: Pressman, Gal <galpress@amazon.com>
>> Sent: Tuesday, September 24, 2019 11:50 AM
>>
>>
>>> On 23 Sep 2019, at 18:22, Michal Kalderon <mkalderon@marvell.com>
>> wrote:
>>>
>>>
>>>>
>>>> From: linux-rdma-owner@vger.kernel.org <linux-rdma-
>>>> owner@vger.kernel.org> On Behalf Of Gal Pressman
>>>>
>>>>> On 19/09/2019 20:55, Jason Gunthorpe wrote:
>>>>> Huh. If you recall we did all this work with the XA and the free
>>>>> callback because you said qedr was mmaping BAR pages that had some
>>>>> HW lifetime associated with them, and the HW resource was not to be
>>>>> reallocated until all users were gone.
>>>>>
>>>>> I think it would be a better example of this API if you pulled the
>>>>>
>>>>>    dev->ops->rdma_remove_user(dev->rdma_ctx, ctx->dpi);
>>>>>
>>>>> Into qedr_mmap_free().
>>>>>
>>>>> Then the rdma_user_mmap_entry_remove() will call it naturally as it
>>>>> does entry_put() and if we are destroying the ucontext we already
>>>>> know the mmaps are destroyed.
>>>>>
>>>>> Maybe the same basic comment for EFA, not sure. Gal?
>>>>
>>>> That's what EFA already does in this series, no?
>>>> We no longer remove entries on dealloc_ucontext, only when the entry
>>>> is freed.
>>>
>>> Actually, I think most of the discussions you had on the topic were
>>> with Gal, but Some apply to qedr as well, however, for qedr, the only
>>> hw resource we allocate (bar) is on alloc_ucontext , therefore we were
>>> safe to free it on dealloc_ucontext as all mappings were already
>>> zapped. Making the mmap_free a bit redundant for qedr except for the
>> need to free the entry.
>>>
>>> For EFA, it seemed the only operation delayed was freeing memory - I
>>> didn't see hw resources being freed... Gal?
>>
>> What do you mean by hw resources being freed? The BAR mappings are
>> under the device’s control and are associated to the lifetime of the UAR.
> The bar offset you get is from the device -> don't you release it back to the device
> So it can be allocated to a different application ? 
> In efa_com_create_qp -> you get offsets from the device that you use for mapping
> The bar -> are these unique for every call ? are they released during destroy_qp ? 
> Before this patch series mmap_entries_remove_free only freed the DMA pages, but
> Following this thread, it seemed the initial intention was that only the hw resources would
> Be delayed as the DMA pages are ref counted anyway.  I didn't see any delay to returning
> The bar offsets to the device. Thanks.
The BAR pages are being "freed" back to the device once the UAR is freed.
These pages' lifetime is under the control of the device so there's nothing the
driver needs to do, just make sure no one else is using them.
Jason Gunthorpe Oct. 21, 2019, 5:33 p.m. UTC | #9
On Sun, Oct 20, 2019 at 10:19:34AM +0300, Gal Pressman wrote:
> On 24/09/2019 12:31, Michal Kalderon wrote:
> >> From: Pressman, Gal <galpress@amazon.com>
> >> Sent: Tuesday, September 24, 2019 11:50 AM
> >>
> >>
> >>> On 23 Sep 2019, at 18:22, Michal Kalderon <mkalderon@marvell.com>
> >> wrote:
> >>>
> >>>
> >>>>
> >>>> From: linux-rdma-owner@vger.kernel.org <linux-rdma-
> >>>> owner@vger.kernel.org> On Behalf Of Gal Pressman
> >>>>
> >>>>> On 19/09/2019 20:55, Jason Gunthorpe wrote:
> >>>>> Huh. If you recall we did all this work with the XA and the free
> >>>>> callback because you said qedr was mmaping BAR pages that had some
> >>>>> HW lifetime associated with them, and the HW resource was not to be
> >>>>> reallocated until all users were gone.
> >>>>>
> >>>>> I think it would be a better example of this API if you pulled the
> >>>>>
> >>>>>    dev->ops->rdma_remove_user(dev->rdma_ctx, ctx->dpi);
> >>>>>
> >>>>> Into qedr_mmap_free().
> >>>>>
> >>>>> Then the rdma_user_mmap_entry_remove() will call it naturally as it
> >>>>> does entry_put() and if we are destroying the ucontext we already
> >>>>> know the mmaps are destroyed.
> >>>>>
> >>>>> Maybe the same basic comment for EFA, not sure. Gal?
> >>>>
> >>>> That's what EFA already does in this series, no?
> >>>> We no longer remove entries on dealloc_ucontext, only when the entry
> >>>> is freed.
> >>>
> >>> Actually, I think most of the discussions you had on the topic were
> >>> with Gal, but Some apply to qedr as well, however, for qedr, the only
> >>> hw resource we allocate (bar) is on alloc_ucontext , therefore we were
> >>> safe to free it on dealloc_ucontext as all mappings were already
> >>> zapped. Making the mmap_free a bit redundant for qedr except for the
> >> need to free the entry.
> >>>
> >>> For EFA, it seemed the only operation delayed was freeing memory - I
> >>> didn't see hw resources being freed... Gal?
> >>
> >> What do you mean by hw resources being freed? The BAR mappings are
> >> under the device’s control and are associated to the lifetime of the UAR.
> > The bar offset you get is from the device -> don't you release it back to the device
> > So it can be allocated to a different application ? 
> > In efa_com_create_qp -> you get offsets from the device that you use for mapping
> > The bar -> are these unique for every call ? are they released during destroy_qp ? 
> > Before this patch series mmap_entries_remove_free only freed the DMA pages, but
> > Following this thread, it seemed the initial intention was that only the hw resources would
> > Be delayed as the DMA pages are ref counted anyway.  I didn't see any delay to returning
> > The bar offsets to the device. Thanks.
> The BAR pages are being "freed" back to the device once the UAR is freed.
> These pages' lifetime is under the control of the device so there's nothing the
> driver needs to do, just make sure no one else is using them.

What frees the UAR?

In the mlx drivers this was done during destruction of the ucontext,
but with this new mmap stuff it could be moved to the mmap_free..

Jason
Gal Pressman Oct. 23, 2019, 6:40 a.m. UTC | #10
On 21/10/2019 20:33, Jason Gunthorpe wrote:
> On Sun, Oct 20, 2019 at 10:19:34AM +0300, Gal Pressman wrote:
>> On 24/09/2019 12:31, Michal Kalderon wrote:
>>>> From: Pressman, Gal <galpress@amazon.com>
>>>> Sent: Tuesday, September 24, 2019 11:50 AM
>>>>
>>>>
>>>>> On 23 Sep 2019, at 18:22, Michal Kalderon <mkalderon@marvell.com>
>>>> wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> From: linux-rdma-owner@vger.kernel.org <linux-rdma-
>>>>>> owner@vger.kernel.org> On Behalf Of Gal Pressman
>>>>>>
>>>>>>> On 19/09/2019 20:55, Jason Gunthorpe wrote:
>>>>>>> Huh. If you recall we did all this work with the XA and the free
>>>>>>> callback because you said qedr was mmaping BAR pages that had some
>>>>>>> HW lifetime associated with them, and the HW resource was not to be
>>>>>>> reallocated until all users were gone.
>>>>>>>
>>>>>>> I think it would be a better example of this API if you pulled the
>>>>>>>
>>>>>>>    dev->ops->rdma_remove_user(dev->rdma_ctx, ctx->dpi);
>>>>>>>
>>>>>>> Into qedr_mmap_free().
>>>>>>>
>>>>>>> Then the rdma_user_mmap_entry_remove() will call it naturally as it
>>>>>>> does entry_put() and if we are destroying the ucontext we already
>>>>>>> know the mmaps are destroyed.
>>>>>>>
>>>>>>> Maybe the same basic comment for EFA, not sure. Gal?
>>>>>>
>>>>>> That's what EFA already does in this series, no?
>>>>>> We no longer remove entries on dealloc_ucontext, only when the entry
>>>>>> is freed.
>>>>>
>>>>> Actually, I think most of the discussions you had on the topic were
>>>>> with Gal, but Some apply to qedr as well, however, for qedr, the only
>>>>> hw resource we allocate (bar) is on alloc_ucontext , therefore we were
>>>>> safe to free it on dealloc_ucontext as all mappings were already
>>>>> zapped. Making the mmap_free a bit redundant for qedr except for the
>>>> need to free the entry.
>>>>>
>>>>> For EFA, it seemed the only operation delayed was freeing memory - I
>>>>> didn't see hw resources being freed... Gal?
>>>>
>>>> What do you mean by hw resources being freed? The BAR mappings are
>>>> under the device’s control and are associated to the lifetime of the UAR.
>>> The bar offset you get is from the device -> don't you release it back to the device
>>> So it can be allocated to a different application ? 
>>> In efa_com_create_qp -> you get offsets from the device that you use for mapping
>>> The bar -> are these unique for every call ? are they released during destroy_qp ? 
>>> Before this patch series mmap_entries_remove_free only freed the DMA pages, but
>>> Following this thread, it seemed the initial intention was that only the hw resources would
>>> Be delayed as the DMA pages are ref counted anyway.  I didn't see any delay to returning
>>> The bar offsets to the device. Thanks.
>> The BAR pages are being "freed" back to the device once the UAR is freed.
>> These pages' lifetime is under the control of the device so there's nothing the
>> driver needs to do, just make sure no one else is using them.
> 
> What frees the UAR?
> 
> In the mlx drivers this was done during destruction of the ucontext,
> but with this new mmap stuff it could be moved to the mmap_free..

Dealloc UAR is currently being called during dealloc_ucontext.
The mmap_free callback is per entry, how can dealloc_uar be moved there?
Jason Gunthorpe Oct. 23, 2019, 2:41 p.m. UTC | #11
On Wed, Oct 23, 2019 at 09:40:44AM +0300, Gal Pressman wrote:
> > In the mlx drivers this was done during destruction of the ucontext,
> > but with this new mmap stuff it could be moved to the mmap_free..
> 
> Dealloc UAR is currently being called during dealloc_ucontext.
> The mmap_free callback is per entry, how can dealloc_uar be moved there?

If it is some global per context uar then sure, it should live till
dealloc ucontext

If it is a dynamicly created uar then it should have a shorter
lifetime.

Jason
Gal Pressman Oct. 24, 2019, 8:06 a.m. UTC | #12
On 23/10/2019 17:41, Jason Gunthorpe wrote:
> On Wed, Oct 23, 2019 at 09:40:44AM +0300, Gal Pressman wrote:
>>> In the mlx drivers this was done during destruction of the ucontext,
>>> but with this new mmap stuff it could be moved to the mmap_free..
>>
>> Dealloc UAR is currently being called during dealloc_ucontext.
>> The mmap_free callback is per entry, how can dealloc_uar be moved there?
> 
> If it is some global per context uar then sure, it should live till
> dealloc ucontext
> 
> If it is a dynamicly created uar then it should have a shorter
> lifetime.

It's a UAR per ucontext. Created on alloc_ucontext and deallocated on
dealloc_ucontext.

Patch
diff mbox series

diff --git a/drivers/infiniband/hw/qedr/main.c b/drivers/infiniband/hw/qedr/main.c
index 5136b835e1ba..aab269602284 100644
--- a/drivers/infiniband/hw/qedr/main.c
+++ b/drivers/infiniband/hw/qedr/main.c
@@ -212,6 +212,7 @@  static const struct ib_device_ops qedr_dev_ops = {
 	.get_link_layer = qedr_link_layer,
 	.map_mr_sg = qedr_map_mr_sg,
 	.mmap = qedr_mmap,
+	.mmap_free = qedr_mmap_free,
 	.modify_port = qedr_modify_port,
 	.modify_qp = qedr_modify_qp,
 	.modify_srq = qedr_modify_srq,
diff --git a/drivers/infiniband/hw/qedr/qedr.h b/drivers/infiniband/hw/qedr/qedr.h
index 0cfd849b13d6..f273906fca19 100644
--- a/drivers/infiniband/hw/qedr/qedr.h
+++ b/drivers/infiniband/hw/qedr/qedr.h
@@ -230,14 +230,10 @@  struct qedr_ucontext {
 	struct qedr_dev *dev;
 	struct qedr_pd *pd;
 	void __iomem *dpi_addr;
+	u64 db_key;
 	u64 dpi_phys_addr;
 	u32 dpi_size;
 	u16 dpi;
-
-	struct list_head mm_head;
-
-	/* Lock to protect mm list */
-	struct mutex mm_list_lock;
 };
 
 union db_prod64 {
@@ -300,14 +296,6 @@  struct qedr_pd {
 	struct qedr_ucontext *uctx;
 };
 
-struct qedr_mm {
-	struct {
-		u64 phy_addr;
-		unsigned long len;
-	} key;
-	struct list_head entry;
-};
-
 union db_prod32 {
 	struct rdma_pwm_val16_data data;
 	u32 raw;
@@ -476,6 +464,13 @@  struct qedr_mr {
 	u32 npages;
 };
 
+struct qedr_user_mmap_entry {
+	struct rdma_user_mmap_entry rdma_entry;
+	u64 address;
+	size_t length;
+	u8 mmap_flag;
+};
+
 #define SET_FIELD2(value, name, flag) ((value) |= ((flag) << (name ## _SHIFT)))
 
 #define QEDR_RESP_IMM	(RDMA_CQE_RESPONDER_IMM_FLG_MASK << \
@@ -574,4 +569,11 @@  static inline struct qedr_srq *get_qedr_srq(struct ib_srq *ibsrq)
 {
 	return container_of(ibsrq, struct qedr_srq, ibsrq);
 }
+
+static inline struct qedr_user_mmap_entry *
+get_qedr_mmap_entry(struct rdma_user_mmap_entry *rdma_entry)
+{
+	return container_of(rdma_entry, struct qedr_user_mmap_entry,
+			    rdma_entry);
+}
 #endif
diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
index 6f3ce86019b7..b796ce2c78bc 100644
--- a/drivers/infiniband/hw/qedr/verbs.c
+++ b/drivers/infiniband/hw/qedr/verbs.c
@@ -58,6 +58,10 @@ 
 
 #define DB_ADDR_SHIFT(addr)		((addr) << DB_PWM_ADDR_OFFSET_SHIFT)
 
+enum {
+	QEDR_USER_MMAP_IO_WC = 0,
+};
+
 static inline int qedr_ib_copy_to_udata(struct ib_udata *udata, void *src,
 					size_t len)
 {
@@ -256,60 +260,6 @@  int qedr_modify_port(struct ib_device *ibdev, u8 port, int mask,
 	return 0;
 }
 
-static int qedr_add_mmap(struct qedr_ucontext *uctx, u64 phy_addr,
-			 unsigned long len)
-{
-	struct qedr_mm *mm;
-
-	mm = kzalloc(sizeof(*mm), GFP_KERNEL);
-	if (!mm)
-		return -ENOMEM;
-
-	mm->key.phy_addr = phy_addr;
-	/* This function might be called with a length which is not a multiple
-	 * of PAGE_SIZE, while the mapping is PAGE_SIZE grained and the kernel
-	 * forces this granularity by increasing the requested size if needed.
-	 * When qedr_mmap is called, it will search the list with the updated
-	 * length as a key. To prevent search failures, the length is rounded up
-	 * in advance to PAGE_SIZE.
-	 */
-	mm->key.len = roundup(len, PAGE_SIZE);
-	INIT_LIST_HEAD(&mm->entry);
-
-	mutex_lock(&uctx->mm_list_lock);
-	list_add(&mm->entry, &uctx->mm_head);
-	mutex_unlock(&uctx->mm_list_lock);
-
-	DP_DEBUG(uctx->dev, QEDR_MSG_MISC,
-		 "added (addr=0x%llx,len=0x%lx) for ctx=%p\n",
-		 (unsigned long long)mm->key.phy_addr,
-		 (unsigned long)mm->key.len, uctx);
-
-	return 0;
-}
-
-static bool qedr_search_mmap(struct qedr_ucontext *uctx, u64 phy_addr,
-			     unsigned long len)
-{
-	bool found = false;
-	struct qedr_mm *mm;
-
-	mutex_lock(&uctx->mm_list_lock);
-	list_for_each_entry(mm, &uctx->mm_head, entry) {
-		if (len != mm->key.len || phy_addr != mm->key.phy_addr)
-			continue;
-
-		found = true;
-		break;
-	}
-	mutex_unlock(&uctx->mm_list_lock);
-	DP_DEBUG(uctx->dev, QEDR_MSG_MISC,
-		 "searched for (addr=0x%llx,len=0x%lx) for ctx=%p, result=%d\n",
-		 mm->key.phy_addr, mm->key.len, uctx, found);
-
-	return found;
-}
-
 int qedr_alloc_ucontext(struct ib_ucontext *uctx, struct ib_udata *udata)
 {
 	struct ib_device *ibdev = uctx->device;
@@ -318,6 +268,7 @@  int qedr_alloc_ucontext(struct ib_ucontext *uctx, struct ib_udata *udata)
 	struct qedr_alloc_ucontext_resp uresp = {};
 	struct qedr_dev *dev = get_qedr_dev(ibdev);
 	struct qed_rdma_add_user_out_params oparams;
+	struct qedr_user_mmap_entry *entry;
 
 	if (!udata)
 		return -EFAULT;
@@ -334,13 +285,27 @@  int qedr_alloc_ucontext(struct ib_ucontext *uctx, struct ib_udata *udata)
 	ctx->dpi_addr = oparams.dpi_addr;
 	ctx->dpi_phys_addr = oparams.dpi_phys_addr;
 	ctx->dpi_size = oparams.dpi_size;
-	INIT_LIST_HEAD(&ctx->mm_head);
-	mutex_init(&ctx->mm_list_lock);
+	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry) {
+		rc = -ENOMEM;
+		goto err;
+	}
+
+	entry->address = ctx->dpi_phys_addr;
+	entry->length = ctx->dpi_size;
+	entry->mmap_flag = QEDR_USER_MMAP_IO_WC;
+	ctx->db_key = rdma_user_mmap_entry_insert(uctx, &entry->rdma_entry,
+						  ctx->dpi_size);
+	if (ctx->db_key == RDMA_USER_MMAP_INVALID) {
+		rc = -ENOMEM;
+		kfree(entry);
+		goto err;
+	}
 
 	uresp.dpm_enabled = dev->user_dpm_enabled;
 	uresp.wids_enabled = 1;
 	uresp.wid_count = oparams.wid_count;
-	uresp.db_pa = ctx->dpi_phys_addr;
+	uresp.db_pa = ctx->db_key;
 	uresp.db_size = ctx->dpi_size;
 	uresp.max_send_wr = dev->attr.max_sqe;
 	uresp.max_recv_wr = dev->attr.max_rqe;
@@ -352,82 +317,107 @@  int qedr_alloc_ucontext(struct ib_ucontext *uctx, struct ib_udata *udata)
 
 	rc = qedr_ib_copy_to_udata(udata, &uresp, sizeof(uresp));
 	if (rc)
-		return rc;
+		goto err1;
 
 	ctx->dev = dev;
 
-	rc = qedr_add_mmap(ctx, ctx->dpi_phys_addr, ctx->dpi_size);
-	if (rc)
-		return rc;
-
 	DP_DEBUG(dev, QEDR_MSG_INIT, "Allocating user context %p\n",
 		 &ctx->ibucontext);
 	return 0;
+
+err1:
+	rdma_user_mmap_entry_remove(uctx, ctx->db_key);
+err:
+	dev->ops->rdma_remove_user(dev->rdma_ctx, ctx->dpi);
+	return rc;
 }
 
 void qedr_dealloc_ucontext(struct ib_ucontext *ibctx)
 {
 	struct qedr_ucontext *uctx = get_qedr_ucontext(ibctx);
-	struct qedr_mm *mm, *tmp;
 
 	DP_DEBUG(uctx->dev, QEDR_MSG_INIT, "Deallocating user context %p\n",
 		 uctx);
-	uctx->dev->ops->rdma_remove_user(uctx->dev->rdma_ctx, uctx->dpi);
 
-	list_for_each_entry_safe(mm, tmp, &uctx->mm_head, entry) {
-		DP_DEBUG(uctx->dev, QEDR_MSG_MISC,
-			 "deleted (addr=0x%llx,len=0x%lx) for ctx=%p\n",
-			 mm->key.phy_addr, mm->key.len, uctx);
-		list_del(&mm->entry);
-		kfree(mm);
-	}
+	rdma_user_mmap_entry_remove(ibctx, uctx->db_key);
+	uctx->dev->ops->rdma_remove_user(uctx->dev->rdma_ctx, uctx->dpi);
 }
 
-int qedr_mmap(struct ib_ucontext *context, struct vm_area_struct *vma)
+void qedr_mmap_free(struct rdma_user_mmap_entry *rdma_entry)
 {
-	struct qedr_ucontext *ucontext = get_qedr_ucontext(context);
-	struct qedr_dev *dev = get_qedr_dev(context->device);
-	unsigned long phys_addr = vma->vm_pgoff << PAGE_SHIFT;
-	unsigned long len = (vma->vm_end - vma->vm_start);
-	unsigned long dpi_start;
+	struct qedr_user_mmap_entry *entry = get_qedr_mmap_entry(rdma_entry);
 
-	dpi_start = dev->db_phys_addr + (ucontext->dpi * ucontext->dpi_size);
-
-	DP_DEBUG(dev, QEDR_MSG_INIT,
-		 "mmap invoked with vm_start=0x%pK, vm_end=0x%pK,vm_pgoff=0x%pK; dpi_start=0x%pK dpi_size=0x%x\n",
-		 (void *)vma->vm_start, (void *)vma->vm_end,
-		 (void *)vma->vm_pgoff, (void *)dpi_start, ucontext->dpi_size);
+	kfree(entry);
+}
 
-	if ((vma->vm_start & (PAGE_SIZE - 1)) || (len & (PAGE_SIZE - 1))) {
-		DP_ERR(dev,
-		       "failed mmap, addresses must be page aligned: start=0x%pK, end=0x%pK\n",
-		       (void *)vma->vm_start, (void *)vma->vm_end);
+int qedr_mmap(struct ib_ucontext *ucontext, struct vm_area_struct *vma)
+{
+	struct ib_device *dev = ucontext->device;
+	size_t length = vma->vm_end - vma->vm_start;
+	u64 key = vma->vm_pgoff << PAGE_SHIFT;
+	struct rdma_user_mmap_entry *rdma_entry;
+	struct qedr_user_mmap_entry *entry;
+	u64 pfn;
+	int err;
+
+	ibdev_dbg(dev,
+		  "start %#lx, end %#lx, length = %#zx, key = %#llx\n",
+		  vma->vm_start, vma->vm_end, length, key);
+
+	if (length % PAGE_SIZE != 0 || !(vma->vm_flags & VM_SHARED)) {
+		ibdev_dbg(dev,
+			  "length[%#zx] is not page size aligned[%#lx] or VM_SHARED is not set [%#lx]\n",
+			  length, PAGE_SIZE, vma->vm_flags);
 		return -EINVAL;
 	}
 
-	if (!qedr_search_mmap(ucontext, phys_addr, len)) {
-		DP_ERR(dev, "failed mmap, vm_pgoff=0x%lx is not authorized\n",
-		       vma->vm_pgoff);
-		return -EINVAL;
+	if (vma->vm_flags & VM_EXEC) {
+		ibdev_dbg(dev, "Mapping executable pages is not permitted\n");
+		return -EPERM;
 	}
+	vma->vm_flags &= ~VM_MAYEXEC;
 
-	if (phys_addr < dpi_start ||
-	    ((phys_addr + len) > (dpi_start + ucontext->dpi_size))) {
-		DP_ERR(dev,
-		       "failed mmap, pages are outside of dpi; page address=0x%pK, dpi_start=0x%pK, dpi_size=0x%x\n",
-		       (void *)phys_addr, (void *)dpi_start,
-		       ucontext->dpi_size);
+	rdma_entry = rdma_user_mmap_entry_get(ucontext, key, length, vma);
+	if (!rdma_entry) {
+		ibdev_dbg(dev, "key[%#llx] does not have valid entry\n",
+			  key);
 		return -EINVAL;
 	}
+	entry = get_qedr_mmap_entry(rdma_entry);
+	if (entry->length != length) {
+		ibdev_dbg(dev,
+			  "key[%#llx] does not have valid length[%#zx] expected[%#zx]\n",
+			  key, length, entry->length);
+		err = -EINVAL;
+		goto err;
+	}
 
-	if (vma->vm_flags & VM_READ) {
-		DP_ERR(dev, "failed mmap, cannot map doorbell bar for read\n");
-		return -EINVAL;
+	ibdev_dbg(dev,
+		  "Mapping address[%#llx], length[%#zx], mmap_flag[%d]\n",
+		  entry->address, length, entry->mmap_flag);
+
+	pfn = entry->address >> PAGE_SHIFT;
+	switch (entry->mmap_flag) {
+	case QEDR_USER_MMAP_IO_WC:
+		err = rdma_user_mmap_io(ucontext, vma, pfn, length,
+					pgprot_writecombine(vma->vm_page_prot));
+		break;
+	default:
+		err = -EINVAL;
+	}
+
+	if (err) {
+		ibdev_dbg(dev,
+			  "Couldn't mmap address[%#llx] length[%#zx] mmap_flag[%d] err[%d]\n",
+			  entry->address, length, entry->mmap_flag, err);
+		goto err;
 	}
 
-	vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
-	return io_remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff, len,
-				  vma->vm_page_prot);
+	return 0;
+
+err:
+	rdma_user_mmap_entry_put(ucontext, rdma_entry);
+	return err;
 }
 
 int qedr_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
diff --git a/drivers/infiniband/hw/qedr/verbs.h b/drivers/infiniband/hw/qedr/verbs.h
index 9aaa90283d6e..3606e97e95da 100644
--- a/drivers/infiniband/hw/qedr/verbs.h
+++ b/drivers/infiniband/hw/qedr/verbs.h
@@ -46,7 +46,8 @@  int qedr_query_pkey(struct ib_device *, u8 port, u16 index, u16 *pkey);
 int qedr_alloc_ucontext(struct ib_ucontext *uctx, struct ib_udata *udata);
 void qedr_dealloc_ucontext(struct ib_ucontext *uctx);
 
-int qedr_mmap(struct ib_ucontext *, struct vm_area_struct *vma);
+int qedr_mmap(struct ib_ucontext *ucontext, struct vm_area_struct *vma);
+void qedr_mmap_free(struct rdma_user_mmap_entry *rdma_entry);
 int qedr_alloc_pd(struct ib_pd *pd, struct ib_udata *udata);
 void qedr_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);