[3/4] libceph: use kvmalloc to allocate page vector

Message ID	20180928094355.6049-3-zyan@redhat.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <ceph-devel-owner@kernel.org> From: "Yan, Zheng" <zyan@redhat.com> To: ceph-devel@vger.kernel.org Cc: idryomov@redhat.com, jlayton@redhat.com, "Yan, Zheng" <zyan@redhat.com> Subject: [PATCH 3/4] libceph: use kvmalloc to allocate page vector Date: Fri, 28 Sep 2018 17:43:54 +0800 Message-Id: <20180928094355.6049-3-zyan@redhat.com> In-Reply-To: <20180928094355.6049-1-zyan@redhat.com> References: <20180928094355.6049-1-zyan@redhat.com> Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk
Series	[1/4] Revert "ceph: fix dentry leak in splice_dentry()" \| expand [1/4] Revert "ceph: fix dentry leak in splice_dentry()" [3/4] libceph: use kvmalloc to allocate page vector [4/4] ceph: check if LOOKUPNAME request was aborted when filling trace

Message ID

20180928094355.6049-3-zyan@redhat.com (mailing list archive)

State

New, archived

Headers

From: "Yan, Zheng" <zyan@redhat.com>
To: ceph-devel@vger.kernel.org
Cc: idryomov@redhat.com, jlayton@redhat.com,
        "Yan, Zheng" <zyan@redhat.com>
Subject: [PATCH 3/4] libceph: use kvmalloc to allocate page vector
Date: Fri, 28 Sep 2018 17:43:54 +0800
Message-Id: <20180928094355.6049-3-zyan@redhat.com>
In-Reply-To: <20180928094355.6049-1-zyan@redhat.com>
References: <20180928094355.6049-1-zyan@redhat.com>
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk

Series

[1/4] Revert "ceph: fix dentry leak in splice_dentry()" | expand

Comments

Ilya Dryomov Sept. 28, 2018, 11:47 a.m. UTC | #1

On Fri, Sep 28, 2018 at 11:45 AM Yan, Zheng <zyan@redhat.com> wrote:
>
> large read may require allocating large page vector
>
> Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
> ---
>  net/ceph/pagevec.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/ceph/pagevec.c b/net/ceph/pagevec.c
> index d3736f5bffec..b1c47ba0c38a 100644
> --- a/net/ceph/pagevec.c
> +++ b/net/ceph/pagevec.c
> @@ -62,7 +62,7 @@ void ceph_release_page_vector(struct page **pages, int num_pages)
>
>         for (i = 0; i < num_pages; i++)
>                 __free_pages(pages[i], 0);
> -       kfree(pages);
> +       kvfree(pages);
>  }
>  EXPORT_SYMBOL(ceph_release_page_vector);
>
> @@ -74,7 +74,7 @@ struct page **ceph_alloc_page_vector(int num_pages, gfp_t flags)
>         struct page **pages;
>         int i;
>
> -       pages = kmalloc_array(num_pages, sizeof(*pages), flags);
> +       pages = kvmalloc_array(num_pages, sizeof(*pages), flags);
>         if (!pages)
>                 return ERR_PTR(-ENOMEM);
>         for (i = 0; i < num_pages; i++) {

Have you considered rewriting ceph_sync_read()?  In the end, it does
a synchronous OSD request per object, it's not atomic in any way, so in
theory it shouldn't need more than object_size worth of pages.

Fengguang's application is doing ~100M reads, and the one he reported
was likely a ~2G read.  The page pointer array itself is only part of
problem, allocating 2G worth of pages when we don't actually need all
of them at once is wrong.

Thanks,

                Ilya

Yan, Zheng Sept. 28, 2018, 12:58 p.m. UTC | #2

On Fri, Sep 28, 2018 at 7:50 PM Ilya Dryomov <idryomov@gmail.com> wrote:
>
> On Fri, Sep 28, 2018 at 11:45 AM Yan, Zheng <zyan@redhat.com> wrote:
> >
> > large read may require allocating large page vector
> >
> > Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
> > ---
> >  net/ceph/pagevec.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/ceph/pagevec.c b/net/ceph/pagevec.c
> > index d3736f5bffec..b1c47ba0c38a 100644
> > --- a/net/ceph/pagevec.c
> > +++ b/net/ceph/pagevec.c
> > @@ -62,7 +62,7 @@ void ceph_release_page_vector(struct page **pages, int num_pages)
> >
> >         for (i = 0; i < num_pages; i++)
> >                 __free_pages(pages[i], 0);
> > -       kfree(pages);
> > +       kvfree(pages);
> >  }
> >  EXPORT_SYMBOL(ceph_release_page_vector);
> >
> > @@ -74,7 +74,7 @@ struct page **ceph_alloc_page_vector(int num_pages, gfp_t flags)
> >         struct page **pages;
> >         int i;
> >
> > -       pages = kmalloc_array(num_pages, sizeof(*pages), flags);
> > +       pages = kvmalloc_array(num_pages, sizeof(*pages), flags);
> >         if (!pages)
> >                 return ERR_PTR(-ENOMEM);
> >         for (i = 0; i < num_pages; i++) {
>
> Have you considered rewriting ceph_sync_read()?  In the end, it does
> a synchronous OSD request per object, it's not atomic in any way, so in
> theory it shouldn't need more than object_size worth of pages.
>

Yes, it's a better. But I'm a little busy recently. No idea when I
will have time.

Regards
Yan, Zheng

> Fengguang's application is doing ~100M reads, and the one he reported
> was likely a ~2G read.  The page pointer array itself is only part of
> problem, allocating 2G worth of pages when we don't actually need all
> of them at once is wrong.
>
> Thanks,
>
>                 Ilya

Jeff Layton Oct. 11, 2018, 3:10 p.m. UTC | #3

On Fri, 2018-09-28 at 17:43 +0800, Yan, Zheng wrote:
> large read may require allocating large page vector
> 
> Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
> ---
>  net/ceph/pagevec.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ceph/pagevec.c b/net/ceph/pagevec.c
> index d3736f5bffec..b1c47ba0c38a 100644
> --- a/net/ceph/pagevec.c
> +++ b/net/ceph/pagevec.c
> @@ -62,7 +62,7 @@ void ceph_release_page_vector(struct page **pages, int num_pages)
>  
>  	for (i = 0; i < num_pages; i++)
>  		__free_pages(pages[i], 0);
> -	kfree(pages);
> +	kvfree(pages);
>  }
>  EXPORT_SYMBOL(ceph_release_page_vector);
>  
> @@ -74,7 +74,7 @@ struct page **ceph_alloc_page_vector(int num_pages, gfp_t flags)
>  	struct page **pages;
>  	int i;
>  
> -	pages = kmalloc_array(num_pages, sizeof(*pages), flags);
> +	pages = kvmalloc_array(num_pages, sizeof(*pages), flags);
>  	if (!pages)
>  		return ERR_PTR(-ENOMEM);
>  	for (i = 0; i < num_pages; i++) {

Reviewed-by: Jeff Layton <jlayton@redhat.com>

diff --git a/net/ceph/pagevec.c b/net/ceph/pagevec.c
index d3736f5bffec..b1c47ba0c38a 100644
--- a/net/ceph/pagevec.c
+++ b/net/ceph/pagevec.c
@@ -62,7 +62,7 @@  void ceph_release_page_vector(struct page **pages, int num_pages)
 
 	for (i = 0; i < num_pages; i++)
 		__free_pages(pages[i], 0);
-	kfree(pages);
+	kvfree(pages);
 }
 EXPORT_SYMBOL(ceph_release_page_vector);
 
@@ -74,7 +74,7 @@  struct page **ceph_alloc_page_vector(int num_pages, gfp_t flags)
 	struct page **pages;
 	int i;
 
-	pages = kmalloc_array(num_pages, sizeof(*pages), flags);
+	pages = kvmalloc_array(num_pages, sizeof(*pages), flags);
 	if (!pages)
 		return ERR_PTR(-ENOMEM);
 	for (i = 0; i < num_pages; i++) {

[3/4] libceph: use kvmalloc to allocate page vector

Commit Message

Comments

Patch