
[1/1] libceph: fix memory leak for messages allocated with CEPH_MSG_DATA_PAGES

Message ID 20200310090924.49788-1-rpenyaev@suse.de (mailing list archive)
State New, archived
Series [1/1] libceph: fix memory leak for messages allocated with CEPH_MSG_DATA_PAGES

Commit Message

Roman Penyaev March 10, 2020, 9:09 a.m. UTC
The OSD client allocates a message with a page vector for the OSD_MAP,
OSD_BACKOFF and WATCH_NOTIFY message types (see the
alloc_msg_with_page_vector() caller), but the page vector is never
released.

Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Sage Weil <sage@redhat.com>
Cc: ceph-devel@vger.kernel.org
---
 net/ceph/messenger.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
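
For context, the allocation side the commit message refers to looks roughly
like the sketch below. This is a reconstruction for illustration only, not a
verbatim copy of net/ceph/osd_client.c: alloc_msg_with_page_vector() hangs a
freshly allocated page vector off the message, and before this patch nothing
on the message-destruction side released it for the CEPH_MSG_DATA_PAGES case.

#include <linux/ceph/libceph.h>
#include <linux/ceph/messenger.h>

/*
 * Rough sketch of the allocation side (reconstructed, not the exact
 * upstream code): the OSD client's ->alloc_msg path builds a message
 * and attaches a page vector for the incoming data.
 */
static struct ceph_msg *alloc_msg_with_page_vector(struct ceph_msg_header *hdr)
{
	u32 data_len = le32_to_cpu(hdr->data_len);
	struct ceph_msg *m;

	m = ceph_msg_new2(le16_to_cpu(hdr->type), le32_to_cpu(hdr->front_len),
			  1, GFP_NOIO, false);
	if (!m)
		return NULL;

	if (data_len) {
		struct page **pages;

		pages = ceph_alloc_page_vector(calc_pages_for(0, data_len),
					       GFP_NOIO);
		if (IS_ERR(pages)) {
			ceph_msg_put(m);
			return NULL;
		}

		/* attach the vector; before this patch it was never released */
		ceph_msg_data_add_pages(m, pages, data_len, 0);
	}

	return m;
}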

Comments

Ilya Dryomov March 10, 2020, 10:03 a.m. UTC | #1
On Tue, Mar 10, 2020 at 10:09 AM Roman Penyaev <rpenyaev@suse.de> wrote:
>
> The OSD client allocates a message with a page vector for the OSD_MAP,
> OSD_BACKOFF and WATCH_NOTIFY message types (see the
> alloc_msg_with_page_vector() caller), but the page vector is never
> released.
>
> Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
> Cc: Ilya Dryomov <idryomov@gmail.com>
> Cc: Jeff Layton <jlayton@kernel.org>
> Cc: Sage Weil <sage@redhat.com>
> Cc: ceph-devel@vger.kernel.org
> ---
>  net/ceph/messenger.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 5b4bd8261002..28cbd55ec2e3 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -3248,8 +3248,15 @@ static struct ceph_msg_data *ceph_msg_data_add(struct ceph_msg *msg)
>
>  static void ceph_msg_data_destroy(struct ceph_msg_data *data)
>  {
> -       if (data->type == CEPH_MSG_DATA_PAGELIST)
> +       if (data->type == CEPH_MSG_DATA_PAGES) {
> +               int num_pages;
> +
> +               num_pages = calc_pages_for(data->alignment,
> +                                          data->length);
> +               ceph_release_page_vector(data->pages, num_pages);
> +       } else if (data->type == CEPH_MSG_DATA_PAGELIST) {
>                 ceph_pagelist_release(data->pagelist);
> +       }
>  }
>
>  void ceph_msg_data_add_pages(struct ceph_msg *msg, struct page **pages,

Hi Roman,

I don't think there is a leak here.

osdmap and backoff messages don't have data.

watch_notify message may or may not have data and this is dealt
with in handle_watch_notify().  The pages are either released in
handle_watch_notify() or transferred to ceph_osdc_notify() through
lreq.  The caller of ceph_osdc_notify() is then responsible for
them:

   * @preply_{pages,len} are initialized both on success and error.
   * The caller is responsible for:
   *
   *     ceph_release_page_vector(...)
   */
  int ceph_osdc_notify(struct ceph_osd_client *osdc,
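
For illustration, here is a minimal, hypothetical caller (not taken from the
tree; the notify payload handling and most error checking are elided) showing
that contract: the reply page vector handed out via @preply_pages is released
by the notify caller, not by the messenger:

#include <linux/ceph/libceph.h>
#include <linux/ceph/osd_client.h>

/* hypothetical caller sketch, for illustration only */
static int example_notify(struct ceph_osd_client *osdc,
			  struct ceph_object_id *oid,
			  struct ceph_object_locator *oloc,
			  void *payload, u32 payload_len, u32 timeout)
{
	struct page **reply_pages = NULL;
	size_t reply_len = 0;
	int ret;

	ret = ceph_osdc_notify(osdc, oid, oloc, payload, payload_len,
			       timeout, &reply_pages, &reply_len);

	/* @preply_{pages,len} are initialized both on success and error */
	if (reply_pages)
		ceph_release_page_vector(reply_pages,
					 calc_pages_for(0, reply_len));
	return ret;
}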

Thanks,

                Ilya
Roman Penyaev March 10, 2020, 10:15 a.m. UTC | #2
On 2020-03-10 11:03, Ilya Dryomov wrote:
> On Tue, Mar 10, 2020 at 10:09 AM Roman Penyaev <rpenyaev@suse.de> 
> wrote:
>> 
>> The OSD client allocates a message with a page vector for the OSD_MAP,
>> OSD_BACKOFF and WATCH_NOTIFY message types (see the
>> alloc_msg_with_page_vector() caller), but the page vector is never
>> released.
>> 
>> Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
>> Cc: Ilya Dryomov <idryomov@gmail.com>
>> Cc: Jeff Layton <jlayton@kernel.org>
>> Cc: Sage Weil <sage@redhat.com>
>> Cc: ceph-devel@vger.kernel.org
>> ---
>>  net/ceph/messenger.c | 9 ++++++++-
>>  1 file changed, 8 insertions(+), 1 deletion(-)
>> 
>> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
>> index 5b4bd8261002..28cbd55ec2e3 100644
>> --- a/net/ceph/messenger.c
>> +++ b/net/ceph/messenger.c
>> @@ -3248,8 +3248,15 @@ static struct ceph_msg_data 
>> *ceph_msg_data_add(struct ceph_msg *msg)
>> 
>>  static void ceph_msg_data_destroy(struct ceph_msg_data *data)
>>  {
>> -       if (data->type == CEPH_MSG_DATA_PAGELIST)
>> +       if (data->type == CEPH_MSG_DATA_PAGES) {
>> +               int num_pages;
>> +
>> +               num_pages = calc_pages_for(data->alignment,
>> +                                          data->length);
>> +               ceph_release_page_vector(data->pages, num_pages);
>> +       } else if (data->type == CEPH_MSG_DATA_PAGELIST) {
>>                 ceph_pagelist_release(data->pagelist);
>> +       }
>>  }
>> 
>>  void ceph_msg_data_add_pages(struct ceph_msg *msg, struct page 
>> **pages,
> 
> Hi Roman,
> 
> I don't think there is a leak here.
> 
> osdmap and backoff messages don't have data.

I tried to be symmetric on this path: an allocation path
exists, but there is no matching deallocation.

> watch_notify message may or may not have data and this is dealt
> with in handle_watch_notify().  The pages are either released in
> handle_watch_notify() or transferred to ceph_osdc_notify() through
> lreq.  The caller of ceph_osdc_notify() is then responsible for
> them:
> 
>    * @preply_{pages,len} are initialized both on success and error.
>    * The caller is responsible for:
>    *
>    *     ceph_release_page_vector(...)
>    */
>   int ceph_osdc_notify(struct ceph_osd_client *osdc,

Thanks for taking a look. Then there is a rare and unobvious leak;
please check ceph_con_in_msg_alloc() in messenger.c. A message can be
allocated for the OSD client (->alloc_msg) and then immediately put by
the following path:

	if (con->state != CON_STATE_OPEN) {
		if (msg)
			ceph_msg_put(msg);

(and also a few lines below, where con->in_msg is put)

without ever reaching dispatch and the set of handle_*() functions you
referred to.
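
To make that concrete, a purely illustrative sketch (the helper name and the
sizes are made up; only the libceph calls are real): allocate a message the
way ->alloc_msg does for WATCH_NOTIFY and drop it straight away, as the
!CON_STATE_OPEN path above does. Before this patch the attached page vector
was never released:

#include <linux/ceph/libceph.h>
#include <linux/ceph/messenger.h>

/* hypothetical reproducer sketch, not meant to be applied anywhere */
static void demo_early_put(void)
{
	size_t len = 4 * PAGE_SIZE;	/* arbitrary data length */
	struct page **pages;
	struct ceph_msg *msg;

	msg = ceph_msg_new2(CEPH_MSG_WATCH_NOTIFY, 128, 1, GFP_NOIO, false);
	if (!msg)
		return;

	pages = ceph_alloc_page_vector(calc_pages_for(0, len), GFP_NOIO);
	if (IS_ERR(pages)) {
		ceph_msg_put(msg);
		return;
	}
	ceph_msg_data_add_pages(msg, pages, len, 0);

	/* what ceph_con_in_msg_alloc() does when the con is not OPEN */
	ceph_msg_put(msg);	/* without this patch, 'pages' leaks here */
}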

And it seems that if ownership of ->pages is transferred to
handle_watch_notify() and the pages are freed there, then this should
be fixed by having the release in one place: either here or there.

--
Roman
Ilya Dryomov March 10, 2020, 10:40 a.m. UTC | #3
On Tue, Mar 10, 2020 at 11:15 AM Roman Penyaev <rpenyaev@suse.de> wrote:
>
> On 2020-03-10 11:03, Ilya Dryomov wrote:
> > On Tue, Mar 10, 2020 at 10:09 AM Roman Penyaev <rpenyaev@suse.de>
> > wrote:
> >>
> >> The OSD client allocates a message with a page vector for the OSD_MAP,
> >> OSD_BACKOFF and WATCH_NOTIFY message types (see the
> >> alloc_msg_with_page_vector() caller), but the page vector is never
> >> released.
> >>
> >> Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
> >> Cc: Ilya Dryomov <idryomov@gmail.com>
> >> Cc: Jeff Layton <jlayton@kernel.org>
> >> Cc: Sage Weil <sage@redhat.com>
> >> Cc: ceph-devel@vger.kernel.org
> >> ---
> >>  net/ceph/messenger.c | 9 ++++++++-
> >>  1 file changed, 8 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> >> index 5b4bd8261002..28cbd55ec2e3 100644
> >> --- a/net/ceph/messenger.c
> >> +++ b/net/ceph/messenger.c
> >> @@ -3248,8 +3248,15 @@ static struct ceph_msg_data
> >> *ceph_msg_data_add(struct ceph_msg *msg)
> >>
> >>  static void ceph_msg_data_destroy(struct ceph_msg_data *data)
> >>  {
> >> -       if (data->type == CEPH_MSG_DATA_PAGELIST)
> >> +       if (data->type == CEPH_MSG_DATA_PAGES) {
> >> +               int num_pages;
> >> +
> >> +               num_pages = calc_pages_for(data->alignment,
> >> +                                          data->length);
> >> +               ceph_release_page_vector(data->pages, num_pages);
> >> +       } else if (data->type == CEPH_MSG_DATA_PAGELIST) {
> >>                 ceph_pagelist_release(data->pagelist);
> >> +       }
> >>  }
> >>
> >>  void ceph_msg_data_add_pages(struct ceph_msg *msg, struct page
> >> **pages,
> >
> > Hi Roman,
> >
> > I don't think there is a leak here.
> >
> > osdmap and backoff messages don't have data.
>
> I tried to be symmetric on this path: an allocation path
> exists, but there is no matching deallocation.
>
> > watch_notify message may or may not have data and this is dealt
> > with in handle_watch_notify().  The pages are either released in
> > handle_watch_notify() or transferred to ceph_osdc_notify() through
> > lreq.  The caller of ceph_osdc_notify() is then responsible for
> > them:
> >
> >    * @preply_{pages,len} are initialized both on success and error.
> >    * The caller is responsible for:
> >    *
> >    *     ceph_release_page_vector(...)
> >    */
> >   int ceph_osdc_notify(struct ceph_osd_client *osdc,
>
> Thanks for taking a look. Then there is a rare and unobvious leak;
> please check ceph_con_in_msg_alloc() in messenger.c. A message can be
> allocated for the OSD client (->alloc_msg) and then immediately put by
> the following path:
>
>         if (con->state != CON_STATE_OPEN) {
>                 if (msg)
>                         ceph_msg_put(msg);
>
> (and also a few lines below, where con->in_msg is put)
>
> without ever reaching dispatch and the set of handle_*() functions you
> referred to.

Yeah, this one is real, and rather old.  That is why there is a TODO
on top of alloc_msg_with_page_vector():

  /*
   * TODO: switch to a msg-owned pagelist
   */

>
> And it seems that if ownership of ->pages is transferred to
> handle_watch_notify() and the pages are freed there, then this should
> be fixed by having the release in one place: either here or there.

The problem is that at least at one point CEPH_MSG_DATA_PAGES needed
a reference count -- it couldn't be freed in one place.  pagelists
are reference counted, but can't be easily swapped in, hence that TODO.

Thanks for reminding me about this.  I'll take a look -- perhaps the
reference count is no longer required and we can get away with a simple
flag.
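
For concreteness, one way such a flag could look; this is a sketch under
assumptions (the own_pages field and the places that would set it are
hypothetical), not the actual follow-up patch:

/*
 * Sketch only: assumes a new own_pages flag in struct ceph_msg_data,
 * set when the page vector was allocated for the message itself (the
 * alloc_msg_with_page_vector() case) and left false when the OSD
 * client attaches caller-owned pages.
 */
static void ceph_msg_data_destroy(struct ceph_msg_data *data)
{
	if (data->type == CEPH_MSG_DATA_PAGELIST) {
		ceph_pagelist_release(data->pagelist);
	} else if (data->type == CEPH_MSG_DATA_PAGES && data->own_pages) {
		int num_pages = calc_pages_for(data->alignment, data->length);

		ceph_release_page_vector(data->pages, num_pages);
	}
}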

                Ilya
Roman Penyaev March 10, 2020, 12:44 p.m. UTC | #4
On 2020-03-10 11:40, Ilya Dryomov wrote:

[skip]

>> And it seems that if ownership of ->pages is transferred to
>> handle_watch_notify() and the pages are freed there, then this should
>> be fixed by having the release in one place: either here or there.
> 
> The problem is that at least at one point CEPH_MSG_DATA_PAGES needed
> a reference count -- it couldn't be freed in one place.  pagelists
> are reference counted, but can't be easily swapped in, hence that TODO.
> 
> Thanks for reminding me about this.  I'll take a look -- perhaps the
> reference count is no longer required and we can get away with a simple
> flag.

To my shallow understanding, handle_watch_notify() also has an error
path which ends up neither freeing nor taking ownership of data->pages,
e.g. the callers of 'goto out_unlock_osdc'. Probably the code relies on
the fact that the sender knows what it is doing and never sends ->data
with a wrong cookie or opcode, but that looks very suspicious to me.

It seems that for this particular DATA_PAGES case it is possible to
just take ownership by zeroing out data->pages and data->length,
which prevents a double free, something like the following:

diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index b68b376d8c2f..15ae6377c461 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -4440,6 +4440,8 @@ static void handle_watch_notify(struct ceph_osd_client *osdc,
 					ceph_release_page_vector(data->pages,
 							calc_pages_for(0, data->length));
 				}
+				data->pages = NULL;
+				data->length = 0;
 			}
 			lreq->notify_finish_error = return_code;


and keeping the current patch, with its explicit call to
ceph_release_page_vector() from ceph_msg_data_destroy() in
messenger.c.

--
Roman
Ilya Dryomov March 10, 2020, 7:20 p.m. UTC | #5
On Tue, Mar 10, 2020 at 1:44 PM Roman Penyaev <rpenyaev@suse.de> wrote:
>
> On 2020-03-10 11:40, Ilya Dryomov wrote:
>
> [skip]
>
> >> And it seems that if ownership of ->pages is transferred to
> >> handle_watch_notify() and the pages are freed there, then this should
> >> be fixed by having the release in one place: either here or there.
> >
> > The problem is that at least at one point CEPH_MSG_DATA_PAGES needed
> > a reference count -- it couldn't be freed in one place.  pagelists
> > are reference counted, but can't be easily swapped in, hence that TODO.
> >
> > Thanks for reminding me about this.  I'll take a look -- perhaps the
> > reference count is no longer required and we can get away with a simple
> > flag.
>
> To my shallow understanding, handle_watch_notify() also has an error
> path which ends up neither freeing nor taking ownership of data->pages,
> e.g. the callers of 'goto out_unlock_osdc'. Probably the code relies on
> the fact that the sender knows what it is doing and never sends ->data
> with a wrong cookie or opcode, but that looks very suspicious to me.
>
> It seems that for this particular DATA_PAGES case it is possible to
> just take ownership by zeroing out data->pages and data->length,
> which prevents a double free, something like the following:
>
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index b68b376d8c2f..15ae6377c461 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -4440,6 +4440,8 @@ static void handle_watch_notify(struct ceph_osd_client *osdc,
>  					ceph_release_page_vector(data->pages,
>  							calc_pages_for(0, data->length));
>  				}
> +				data->pages = NULL;
> +				data->length = 0;
>  			}
>  			lreq->notify_finish_error = return_code;
>
>
> and keeping the current patch with explicit call of
> ceph_release_page_vector from ceph_msg_data_destroy() from
> messenger.c.

Not quite -- that would break all of OSD client, which supplies its
own page vectors.
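
For illustration, a hypothetical sketch of what "supplies its own page
vectors" means in practice (example_read() is made up and the request/op
setup is elided; only the exported libceph helpers are real): a normal OSD
read attaches a caller-owned page vector to the request, the reply message
ends up referencing those same pages, and the messenger must not free them
when the message is put:

#include <linux/ceph/libceph.h>
#include <linux/ceph/osd_client.h>

/* hypothetical caller, assumes op 0 (e.g. CEPH_OSD_OP_READ) is already
 * set up on @req; shown only to illustrate page-vector ownership */
static int example_read(struct ceph_osd_client *osdc,
			struct ceph_osd_request *req, u64 len)
{
	struct page **pages;
	int ret;

	pages = ceph_alloc_page_vector(calc_pages_for(0, len), GFP_NOFS);
	if (IS_ERR(pages))
		return PTR_ERR(pages);

	/* own_pages == false: the caller, not the request, owns 'pages' */
	osd_req_op_extent_osd_data_pages(req, 0, pages, len, 0, false, false);

	ret = ceph_osdc_start_request(osdc, req, false);
	if (!ret)
		ret = ceph_osdc_wait_request(osdc, req);

	/* ... copy the data out ... */
	ceph_release_page_vector(pages, calc_pages_for(0, len));
	return ret;
}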

I'll send a patch in a few.

Thanks,

                Ilya

Patch

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 5b4bd8261002..28cbd55ec2e3 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -3248,8 +3248,15 @@  static struct ceph_msg_data *ceph_msg_data_add(struct ceph_msg *msg)
 
 static void ceph_msg_data_destroy(struct ceph_msg_data *data)
 {
-	if (data->type == CEPH_MSG_DATA_PAGELIST)
+	if (data->type == CEPH_MSG_DATA_PAGES) {
+		int num_pages;
+
+		num_pages = calc_pages_for(data->alignment,
+					   data->length);
+		ceph_release_page_vector(data->pages, num_pages);
+	} else if (data->type == CEPH_MSG_DATA_PAGELIST) {
 		ceph_pagelist_release(data->pagelist);
+	}
 }
 
 void ceph_msg_data_add_pages(struct ceph_msg *msg, struct page **pages,