Message ID | 20221120012939.539953-1-yanjun.zhu@intel.com (mailing list archive) |
---|---|
State | Rejected |
Delegated to: | Jason Gunthorpe |
Headers | show |
Series | [for-next,1/1] RDMA/rxe: sgt_append from ib_umem_get is not highmem | expand |
There was a thread that tries to refactor iova_to_vaddr[1][2], where page_address will be drop. if so, your changes will be no longer needed. [1]https://lore.kernel.org/lkml/a7decec2-77d9-db4c-ff66-ff597da8bc71@fujitsu.com/T/ [2]https://www.spinics.net/lists/linux-rdma/msg114053.html Thanks Zhijian On 20/11/2022 09:29, Zhu Yanjun wrote: > From: Zhu Yanjun <yanjun.zhu@linux.dev> > > In ib_umem_get, sgt_append is allocated from the function > sg_alloc_append_table_from_pages. And it is not from highmem. > > As such, the return value of page_address will not be NULL. > > Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> > --- > drivers/infiniband/sw/rxe/rxe_mr.c | 9 ++------- > 1 file changed, 2 insertions(+), 7 deletions(-) > > diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c > index b1423000e4bc..3f33a4cdef24 100644 > --- a/drivers/infiniband/sw/rxe/rxe_mr.c > +++ b/drivers/infiniband/sw/rxe/rxe_mr.c > @@ -119,7 +119,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova, > struct ib_umem *umem; > struct sg_page_iter sg_iter; > int num_buf; > - void *vaddr; > int err; > > umem = ib_umem_get(&rxe->ib_dev, start, length, access); > @@ -149,6 +148,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova, > buf = map[0]->buf; > > for_each_sgtable_page (&umem->sgt_append.sgt, &sg_iter, 0) { > + void *vaddr; > + > if (num_buf >= RXE_BUF_PER_MAP) { > map++; > buf = map[0]->buf; > @@ -156,16 +157,10 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova, > } > > vaddr = page_address(sg_page_iter_page(&sg_iter)); > - if (!vaddr) { > - rxe_dbg_mr(mr, "Unable to get virtual address\n"); > - err = -ENOMEM; > - goto err_release_umem; > - } > buf->addr = (uintptr_t)vaddr; > buf->size = PAGE_SIZE; > num_buf++; > buf++; > - > } > } >
在 2022/11/19 18:30, lizhijian@fujitsu.com 写道: > > There was a thread that tries to refactor iova_to_vaddr[1][2], where > page_address will be drop. if so, your changes will be no longer needed. My commit is to indicate that your commit is not good enough. And you should make all the related commits in a patch series. Zhu Yanjun > > [1]https://lore.kernel.org/lkml/a7decec2-77d9-db4c-ff66-ff597da8bc71@fujitsu.com/T/ > [2]https://www.spinics.net/lists/linux-rdma/msg114053.html > > Thanks > Zhijian > > > On 20/11/2022 09:29, Zhu Yanjun wrote: >> From: Zhu Yanjun <yanjun.zhu@linux.dev> >> >> In ib_umem_get, sgt_append is allocated from the function >> sg_alloc_append_table_from_pages. And it is not from highmem. >> >> As such, the return value of page_address will not be NULL. >> >> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> >> --- >> drivers/infiniband/sw/rxe/rxe_mr.c | 9 ++------- >> 1 file changed, 2 insertions(+), 7 deletions(-) >> >> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c >> index b1423000e4bc..3f33a4cdef24 100644 >> --- a/drivers/infiniband/sw/rxe/rxe_mr.c >> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c >> @@ -119,7 +119,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova, >> struct ib_umem *umem; >> struct sg_page_iter sg_iter; >> int num_buf; >> - void *vaddr; >> int err; >> >> umem = ib_umem_get(&rxe->ib_dev, start, length, access); >> @@ -149,6 +148,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova, >> buf = map[0]->buf; >> >> for_each_sgtable_page (&umem->sgt_append.sgt, &sg_iter, 0) { >> + void *vaddr; >> + >> if (num_buf >= RXE_BUF_PER_MAP) { >> map++; >> buf = map[0]->buf; >> @@ -156,16 +157,10 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova, >> } >> >> vaddr = page_address(sg_page_iter_page(&sg_iter)); >> - if (!vaddr) { >> - rxe_dbg_mr(mr, "Unable to get virtual address\n"); >> - err = -ENOMEM; >> - goto err_release_umem; >> - } >> buf->addr = (uintptr_t)vaddr; >> buf->size = PAGE_SIZE; >> num_buf++; >> buf++; >> - >> } >> } >>
On Sat, Nov 19, 2022 at 08:29:38PM -0500, Zhu Yanjun wrote: > From: Zhu Yanjun <yanjun.zhu@linux.dev> > > In ib_umem_get, sgt_append is allocated from the function > sg_alloc_append_table_from_pages. And it is not from highmem. You've confused the allocation of the SGL table itself with the page_address called on the struct page * stored inside the SGL table. Think of the SGL as an array of 'struct page *' The page_address() can return NULL because the 'struct page *' it contains came from userspace and could very will be highmem. Jason
在 2022/11/21 23:28, Jason Gunthorpe 写道: > On Sat, Nov 19, 2022 at 08:29:38PM -0500, Zhu Yanjun wrote: >> From: Zhu Yanjun <yanjun.zhu@linux.dev> >> >> In ib_umem_get, sgt_append is allocated from the function >> sg_alloc_append_table_from_pages. And it is not from highmem. > > You've confused the allocation of the SGL table itself with the > page_address called on the struct page * stored inside the SGL table. > > Think of the SGL as an array of 'struct page *' > About "The page_address() can return NULL because the 'struct page *' it contains came from userspace and could very will be highmem.", From the function ib_umem *ib_umem_get(struct ib_device *device, unsigned long addr,size_t size, int access), I agree with you that struct page comes from user space. But from "Understanding the Linux Kernel", third edition - sections "8.1.3. Memory Zones" and "8.1.6. Kernel Mappings of High-Memory Page Frames". In the process' virtual address space, the user space occupies the first 3GB, and the kernel space the 4th GB of this linear address space. The first 896MB of the kernel space (not only kernel code, but its data also) is "directly" mapped to the first 896 MB of the physical memory. The last 128MB part of the virtual kernel space is where are mapped some pieces of the physical "high memory" (> 896MB) : thus it can only map no more than 128MB of "high memory" at a time. It seems that page_address of these "128MB high memory" will return NULL. But can user space access these high memory? From "Understanding the Linux Kernel", third edition, it seems that it is in kernel space. Thanks and Regards, Zhu Yanjun > > Jason
On Tue, Nov 22, 2022 at 12:07:06PM +0800, Yanjun Zhu wrote: > But can user space access these high memory? From "Understanding the Linux > Kernel", third edition, it seems that it is in kernel space. Yes, "highmem" is effecitvely the ability to have a struct page * that is not mapped into the kernel address space, but is mapped into a userspace process address sspace. Jason
在 2022/11/22 22:06, Jason Gunthorpe 写道: > On Tue, Nov 22, 2022 at 12:07:06PM +0800, Yanjun Zhu wrote: > >> But can user space access these high memory? From "Understanding the Linux >> Kernel", third edition, it seems that it is in kernel space. > Yes, "highmem" is effecitvely the ability to have a struct page * that > is not mapped into the kernel address space, but is mapped into a > userspace process address sspace. Got it. I am interested in this "highmem is mapped into a userspace process address space.". Would you like to share some documents, links or source code about this with me? I want to delve into this "highmem is mapped into a userspace process address space." Thanks and Regards, Zhu Yanjun > > Jason
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c index b1423000e4bc..3f33a4cdef24 100644 --- a/drivers/infiniband/sw/rxe/rxe_mr.c +++ b/drivers/infiniband/sw/rxe/rxe_mr.c @@ -119,7 +119,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova, struct ib_umem *umem; struct sg_page_iter sg_iter; int num_buf; - void *vaddr; int err; umem = ib_umem_get(&rxe->ib_dev, start, length, access); @@ -149,6 +148,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova, buf = map[0]->buf; for_each_sgtable_page (&umem->sgt_append.sgt, &sg_iter, 0) { + void *vaddr; + if (num_buf >= RXE_BUF_PER_MAP) { map++; buf = map[0]->buf; @@ -156,16 +157,10 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova, } vaddr = page_address(sg_page_iter_page(&sg_iter)); - if (!vaddr) { - rxe_dbg_mr(mr, "Unable to get virtual address\n"); - err = -ENOMEM; - goto err_release_umem; - } buf->addr = (uintptr_t)vaddr; buf->size = PAGE_SIZE; num_buf++; buf++; - } }