Issue #5876 : assertion failure in rbd_img_obj_callback()

Message ID	5331C05D.1060008@ieee.org (mailing list archive)
State	New, archived
Headers	show Return-Path: <ceph-devel-owner@kernel.org> Message-ID: <5331C05D.1060008@ieee.org> Date: Tue, 25 Mar 2014 12:43:57 -0500 From: Alex Elder <elder@ieee.org> User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.3.0 MIME-Version: 1.0 To: Olivier Bonvalet <ceph.list@daevel.fr> CC: Ilya Dryomov <ilya.dryomov@inktank.com>, Ceph Development <ceph-devel@vger.kernel.org> Subject: Re: Issue #5876 : assertion failure in rbd_img_obj_callback() References: <1395736765.2823.29.camel@localhost> <CALFYKtCUR6Swm5yxBP_7y42QVjj=v+XCtag7FmJ61AvoFjUmRA@mail.gmail.com> <53316D18.7040103@ieee.org> <CALFYKtCU4G7pebWV=icEdvL3Lf0aoUWbtxOYk7S2rV5G_UeynA@mail.gmail.com> <53317BC2.9010700@ieee.org> <CALFYKtDKDVRke9PvF3zX1ujHJ+5gTO+yJRrQLj0saZu+Ca8k-w@mail.gmail.com> <1395753516.2823.37.camel@localhost> <533184AF.9050101@ieee.org> <5331853D.40408@ieee.org> <1395767705.9967.5.camel@localhost> In-Reply-To: <1395767705.9967.5.camel@localhost> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk

Alex Elder March 25, 2014, 5:43 p.m. UTC

Please try applying this, on top of the previous patch.
If you can then reproduce the problem we'll have a bunch
of new information about the particular request that's
leading to the failure.  That might tell us what more we
can do to find the root cause.  Thank you.

					-Alex

PS  I hope my mailer doesn't botch the long lines.  It might.

---
 drivers/block/rbd.c |   25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Olivier Bonvalet March 25, 2014, 6:53 p.m. UTC | #1

Le mardi 25 mars 2014 à 12:43 -0500, Alex Elder a écrit :
> Please try applying this, on top of the previous patch.
> If you can then reproduce the problem we'll have a bunch
> of new information about the particular request that's
> leading to the failure.  That might tell us what more we
> can do to find the root cause.  Thank you.
> 
> 					-Alex
> 
> PS  I hope my mailer doesn't botch the long lines.  It might.
> 

Here the execution will continue, no more kernel panic after this
debugging display. Is it wanted ?

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Alex Elder March 25, 2014, 7:03 p.m. UTC | #2

On 03/25/2014 01:53 PM, Olivier Bonvalet wrote:
> Le mardi 25 mars 2014 à 12:43 -0500, Alex Elder a écrit :
>> Please try applying this, on top of the previous patch.
>> If you can then reproduce the problem we'll have a bunch
>> of new information about the particular request that's
>> leading to the failure.  That might tell us what more we
>> can do to find the root cause.  Thank you.
>>
>> 					-Alex
>>
>> PS  I hope my mailer doesn't botch the long lines.  It might.
>>
> 
> Here the execution will continue, no more kernel panic after this
> debugging display. Is it wanted ?


I guess it should panic.  I'm glad you mentioned this.

I'm sorry to have it reproducing on a production cluster.
Unfortunately at this point it's the only place we've
been able to reproduce it (and with any luck we'll find
the problem soon).

					-Alex
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Ilya Dryomov March 25, 2014, 8:18 p.m. UTC | #3

On Tue, Mar 25, 2014 at 9:03 PM, Alex Elder <elder@ieee.org> wrote:
> On 03/25/2014 01:53 PM, Olivier Bonvalet wrote:
>> Le mardi 25 mars 2014 à 12:43 -0500, Alex Elder a écrit :
>>> Please try applying this, on top of the previous patch.
>>> If you can then reproduce the problem we'll have a bunch
>>> of new information about the particular request that's
>>> leading to the failure.  That might tell us what more we
>>> can do to find the root cause.  Thank you.
>>>
>>>                                      -Alex
>>>
>>> PS  I hope my mailer doesn't botch the long lines.  It might.
>>>
>>
>> Here the execution will continue, no more kernel panic after this
>> debugging display. Is it wanted ?
>
>
> I guess it should panic.  I'm glad you mentioned this.

Just in case, if you haven't done it already: stick rbd_assert(0);
after the last printk in that if statement, so it looks like this:

if (which != img_request->next_completion) {
        printk("%s: bad image object request information:\n", __func__);
        printk("obj_request %p\n", obj_request);
        printk("    ->object_name <%s>\n", obj_request->object_name);
        ...

        printk("img_request %p\n", img_request);
        printk("    ->snap 0x%016llx\n", img_request->snap_id);
        ...
        printk("    ->result %d\n", img_request->result);

        rbd_assert(0);
}

Thanks,

                Ilya
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Olivier Bonvalet March 25, 2014, 8:21 p.m. UTC | #4

Le mardi 25 mars 2014 à 22:18 +0200, Ilya Dryomov a écrit :
> On Tue, Mar 25, 2014 at 9:03 PM, Alex Elder <elder@ieee.org> wrote:
> > On 03/25/2014 01:53 PM, Olivier Bonvalet wrote:
> >> Le mardi 25 mars 2014 à 12:43 -0500, Alex Elder a écrit :
> >>> Please try applying this, on top of the previous patch.
> >>> If you can then reproduce the problem we'll have a bunch
> >>> of new information about the particular request that's
> >>> leading to the failure.  That might tell us what more we
> >>> can do to find the root cause.  Thank you.
> >>>
> >>>                                      -Alex
> >>>
> >>> PS  I hope my mailer doesn't botch the long lines.  It might.
> >>>
> >>
> >> Here the execution will continue, no more kernel panic after this
> >> debugging display. Is it wanted ?
> >
> >
> > I guess it should panic.  I'm glad you mentioned this.
> 
> Just in case, if you haven't done it already: stick rbd_assert(0);
> after the last printk in that if statement, so it looks like this:
> 
> if (which != img_request->next_completion) {
>         printk("%s: bad image object request information:\n", __func__);
>         printk("obj_request %p\n", obj_request);
>         printk("    ->object_name <%s>\n", obj_request->object_name);
>         ...
> 
>         printk("img_request %p\n", img_request);
>         printk("    ->snap 0x%016llx\n", img_request->snap_id);
>         ...
>         printk("    ->result %d\n", img_request->result);
> 
>         rbd_assert(0);
> }
> 
> Thanks,
> 
>                 Ilya
> 

Without the rbd_assert(0), I add this hang :


Mar 25 21:17:58 murmillia kernel: [ 2205.255933] rbd_img_obj_callback: bad image object request information:
Mar 25 21:17:58 murmillia kernel: [ 2205.255938] obj_request ffff88025a2b3c48
Mar 25 21:17:58 murmillia kernel: [ 2205.255940]     ->object_name <rb.0.1536881.238e1f29.000000000439>
Mar 25 21:17:58 murmillia kernel: [ 2205.255941]     ->offset 0
Mar 25 21:17:58 murmillia kernel: [ 2205.255943]     ->length 28672
Mar 25 21:17:58 murmillia kernel: [ 2205.255944]     ->type 0x1
Mar 25 21:17:58 murmillia kernel: [ 2205.255945]     ->flags 0x3
Mar 25 21:17:58 murmillia kernel: [ 2205.255946]     ->which 1
Mar 25 21:17:58 murmillia kernel: [ 2205.255948]     ->xferred 28672
Mar 25 21:17:58 murmillia kernel: [ 2205.255949]     ->result 0
Mar 25 21:17:58 murmillia kernel: [ 2205.255950] img_request ffff8802536c4a60
Mar 25 21:17:58 murmillia kernel: [ 2205.255952]     ->snap 0xffff880257f85ec0
Mar 25 21:17:58 murmillia kernel: [ 2205.255953]     ->offset 4534026240
Mar 25 21:17:58 murmillia kernel: [ 2205.255954]     ->length 45056
Mar 25 21:17:58 murmillia kernel: [ 2205.255955]     ->flags 0x1
Mar 25 21:17:58 murmillia kernel: [ 2205.255957]     ->obj_request_count 1
Mar 25 21:17:58 murmillia kernel: [ 2205.255958]     ->next_completion 2
Mar 25 21:17:58 murmillia kernel: [ 2205.255959]     ->xferred 45056
Mar 25 21:17:58 murmillia kernel: [ 2205.255960]     ->result 0
Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 
Mar 25 21:17:58 murmillia kernel: [ 2205.255962] Assertion failure in rbd_img_obj_callback() at line 2162:
Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 
Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 	rbd_assert(which < img_request->obj_request_count);
Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 
Mar 25 21:17:58 murmillia kernel: [ 2205.256141] ------------[ cut here ]------------
Mar 25 21:17:58 murmillia kernel: [ 2205.256178] kernel BUG at drivers/block/rbd.c:2162!


--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Alex Elder March 25, 2014, 8:24 p.m. UTC | #5

On 03/25/2014 03:21 PM, Olivier Bonvalet wrote:
> Le mardi 25 mars 2014 à 22:18 +0200, Ilya Dryomov a écrit :
>> On Tue, Mar 25, 2014 at 9:03 PM, Alex Elder <elder@ieee.org> wrote:
>>> On 03/25/2014 01:53 PM, Olivier Bonvalet wrote:
>>>> Le mardi 25 mars 2014 à 12:43 -0500, Alex Elder a écrit :
>>>>> Please try applying this, on top of the previous patch.
>>>>> If you can then reproduce the problem we'll have a bunch
>>>>> of new information about the particular request that's
>>>>> leading to the failure.  That might tell us what more we
>>>>> can do to find the root cause.  Thank you.
>>>>>
>>>>>                                      -Alex
>>>>>
>>>>> PS  I hope my mailer doesn't botch the long lines.  It might.
>>>>>
>>>>
>>>> Here the execution will continue, no more kernel panic after this
>>>> debugging display. Is it wanted ?
>>>
>>>
>>> I guess it should panic.  I'm glad you mentioned this.
>>
>> Just in case, if you haven't done it already: stick rbd_assert(0);
>> after the last printk in that if statement, so it looks like this:
>>
>> if (which != img_request->next_completion) {
>>         printk("%s: bad image object request information:\n", __func__);
>>         printk("obj_request %p\n", obj_request);
>>         printk("    ->object_name <%s>\n", obj_request->object_name);
>>         ...
>>
>>         printk("img_request %p\n", img_request);
>>         printk("    ->snap 0x%016llx\n", img_request->snap_id);
>>         ...
>>         printk("    ->result %d\n", img_request->result);
>>
>>         rbd_assert(0);
>> }
>>
>> Thanks,
>>
>>                 Ilya
>>
> 
> Without the rbd_assert(0), I add this hang :
> 
> 
> Mar 25 21:17:58 murmillia kernel: [ 2205.255933] rbd_img_obj_callback: bad image object request information:
> Mar 25 21:17:58 murmillia kernel: [ 2205.255938] obj_request ffff88025a2b3c48
> Mar 25 21:17:58 murmillia kernel: [ 2205.255940]     ->object_name <rb.0.1536881.238e1f29.000000000439>
> Mar 25 21:17:58 murmillia kernel: [ 2205.255941]     ->offset 0
> Mar 25 21:17:58 murmillia kernel: [ 2205.255943]     ->length 28672
> Mar 25 21:17:58 murmillia kernel: [ 2205.255944]     ->type 0x1
		BIO request

> Mar 25 21:17:58 murmillia kernel: [ 2205.255945]     ->flags 0x3
	IMG_DATA, KNOWN

> Mar 25 21:17:58 murmillia kernel: [ 2205.255946]     ->which 1
	Second object in the request

> Mar 25 21:17:58 murmillia kernel: [ 2205.255948]     ->xferred 28672
> Mar 25 21:17:58 murmillia kernel: [ 2205.255949]     ->result 0
> Mar 25 21:17:58 murmillia kernel: [ 2205.255950] img_request ffff8802536c4a60
> Mar 25 21:17:58 murmillia kernel: [ 2205.255952]     ->snap 0xffff880257f85ec0
> Mar 25 21:17:58 murmillia kernel: [ 2205.255953]     ->offset 4534026240
> Mar 25 21:17:58 murmillia kernel: [ 2205.255954]     ->length 45056
> Mar 25 21:17:58 murmillia kernel: [ 2205.255955]     ->flags 0x1
> Mar 25 21:17:58 murmillia kernel: [ 2205.255957]     ->obj_request_count 1
	!!! There is only one request... (?)

So obj_request_count might be getting computed incorrectly.

					-Alex

> Mar 25 21:17:58 murmillia kernel: [ 2205.255958]     ->next_completion 2
> Mar 25 21:17:58 murmillia kernel: [ 2205.255959]     ->xferred 45056
> Mar 25 21:17:58 murmillia kernel: [ 2205.255960]     ->result 0
> Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 
> Mar 25 21:17:58 murmillia kernel: [ 2205.255962] Assertion failure in rbd_img_obj_callback() at line 2162:
> Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 
> Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 	rbd_assert(which < img_request->obj_request_count);
> Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 
> Mar 25 21:17:58 murmillia kernel: [ 2205.256141] ------------[ cut here ]------------
> Mar 25 21:17:58 murmillia kernel: [ 2205.256178] kernel BUG at drivers/block/rbd.c:2162!
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Olivier Bonvalet March 25, 2014, 8:29 p.m. UTC | #6

Le mardi 25 mars 2014 à 15:24 -0500, Alex Elder a écrit :
> On 03/25/2014 03:21 PM, Olivier Bonvalet wrote:
> > Le mardi 25 mars 2014 à 22:18 +0200, Ilya Dryomov a écrit :
> >> On Tue, Mar 25, 2014 at 9:03 PM, Alex Elder <elder@ieee.org> wrote:
> >>> On 03/25/2014 01:53 PM, Olivier Bonvalet wrote:
> >>>> Le mardi 25 mars 2014 à 12:43 -0500, Alex Elder a écrit :
> >>>>> Please try applying this, on top of the previous patch.
> >>>>> If you can then reproduce the problem we'll have a bunch
> >>>>> of new information about the particular request that's
> >>>>> leading to the failure.  That might tell us what more we
> >>>>> can do to find the root cause.  Thank you.
> >>>>>
> >>>>>                                      -Alex
> >>>>>
> >>>>> PS  I hope my mailer doesn't botch the long lines.  It might.
> >>>>>
> >>>>
> >>>> Here the execution will continue, no more kernel panic after this
> >>>> debugging display. Is it wanted ?
> >>>
> >>>
> >>> I guess it should panic.  I'm glad you mentioned this.
> >>
> >> Just in case, if you haven't done it already: stick rbd_assert(0);
> >> after the last printk in that if statement, so it looks like this:
> >>
> >> if (which != img_request->next_completion) {
> >>         printk("%s: bad image object request information:\n", __func__);
> >>         printk("obj_request %p\n", obj_request);
> >>         printk("    ->object_name <%s>\n", obj_request->object_name);
> >>         ...
> >>
> >>         printk("img_request %p\n", img_request);
> >>         printk("    ->snap 0x%016llx\n", img_request->snap_id);
> >>         ...
> >>         printk("    ->result %d\n", img_request->result);
> >>
> >>         rbd_assert(0);
> >> }
> >>
> >> Thanks,
> >>
> >>                 Ilya
> >>
> > 
> > Without the rbd_assert(0), I add this hang :
> > 
> > 
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255933] rbd_img_obj_callback: bad image object request information:
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255938] obj_request ffff88025a2b3c48
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255940]     ->object_name <rb.0.1536881.238e1f29.000000000439>
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255941]     ->offset 0
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255943]     ->length 28672
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255944]     ->type 0x1
> 		BIO request
> 
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255945]     ->flags 0x3
> 	IMG_DATA, KNOWN
> 
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255946]     ->which 1
> 	Second object in the request
> 
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255948]     ->xferred 28672
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255949]     ->result 0
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255950] img_request ffff8802536c4a60
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255952]     ->snap 0xffff880257f85ec0
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255953]     ->offset 4534026240
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255954]     ->length 45056
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255955]     ->flags 0x1
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255957]     ->obj_request_count 1
> 	!!! There is only one request... (?)
> 
> So obj_request_count might be getting computed incorrectly.
> 
> 					-Alex
> 
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255958]     ->next_completion 2
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255959]     ->xferred 45056
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255960]     ->result 0
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255962] Assertion failure in rbd_img_obj_callback() at line 2162:
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 	rbd_assert(which < img_request->obj_request_count);
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 
> > Mar 25 21:17:58 murmillia kernel: [ 2205.256141] ------------[ cut here ]------------
> > Mar 25 21:17:58 murmillia kernel: [ 2205.256178] kernel BUG at drivers/block/rbd.c:2162!
> > 
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


Here I was migrating (xen live migrate) ~20 VM from one host to
"murmillia", just after booting "murmillia" on the new kernel.

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Alex Elder March 25, 2014, 8:41 p.m. UTC | #7

On 03/25/2014 03:21 PM, Olivier Bonvalet wrote:
> Mar 25 21:17:58 murmillia kernel: [ 2205.255933] rbd_img_obj_callback: bad image object request information:
> Mar 25 21:17:58 murmillia kernel: [ 2205.255938] obj_request ffff88025a2b3c48
> Mar 25 21:17:58 murmillia kernel: [ 2205.255940]     ->object_name <rb.0.1536881.238e1f29.000000000439>
> Mar 25 21:17:58 murmillia kernel: [ 2205.255941]     ->offset 0
> Mar 25 21:17:58 murmillia kernel: [ 2205.255943]     ->length 28672
> Mar 25 21:17:58 murmillia kernel: [ 2205.255944]     ->type 0x1
> Mar 25 21:17:58 murmillia kernel: [ 2205.255945]     ->flags 0x3
> Mar 25 21:17:58 murmillia kernel: [ 2205.255946]     ->which 1
> Mar 25 21:17:58 murmillia kernel: [ 2205.255948]     ->xferred 28672
> Mar 25 21:17:58 murmillia kernel: [ 2205.255949]     ->result 0
> Mar 25 21:17:58 murmillia kernel: [ 2205.255950] img_request ffff8802536c4a60
> Mar 25 21:17:58 murmillia kernel: [ 2205.255952]     ->snap 0xffff880257f85ec0
> Mar 25 21:17:58 murmillia kernel: [ 2205.255953]     ->offset 4534026240
> Mar 25 21:17:58 murmillia kernel: [ 2205.255954]     ->length 45056
> Mar 25 21:17:58 murmillia kernel: [ 2205.255955]     ->flags 0x1

The image request has IMG_REQ_CHILD set.  That means it's a request
sent to a parent image.

I see one thing that I'm suspicious of, but I'll need to look at it
a bit to see if it's really a problem.

In rbd_parent_request_destroy(), should this last line:
        rbd_img_request_destroy(kref);
be this instead?
	rbd_img_request_put(parent_request);

And get rid of this line above that:
        img_request_child_clear(parent_request);

Something to look at.  Also the fact that it's a parent
request with two objects involved seems strange, but
again, I'm just speculating at this point.

					-Alex

> Mar 25 21:17:58 murmillia kernel: [ 2205.255957]     ->obj_request_count 1
> Mar 25 21:17:58 murmillia kernel: [ 2205.255958]     ->next_completion 2
> Mar 25 21:17:58 murmillia kernel: [ 2205.255959]     ->xferred 45056
> Mar 25 21:17:58 murmillia kernel: [ 2205.255960]     ->result 0

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Alex Elder March 25, 2014, 8:44 p.m. UTC | #8

Olivier, it appears this is a layered image, i.e., the image
is a clone of another.  Can you tell us any more about the
way these images are organized?  Do you have one master image
and others are based on that?  Is there more than one layer to
the organization?  (Do these questions make sense?)

					-Alex
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Olivier Bonvalet March 25, 2014, 8:53 p.m. UTC | #9

Le mardi 25 mars 2014 à 21:21 +0100, Olivier Bonvalet a écrit :
> Le mardi 25 mars 2014 à 22:18 +0200, Ilya Dryomov a écrit :
> > On Tue, Mar 25, 2014 at 9:03 PM, Alex Elder <elder@ieee.org> wrote:
> > > On 03/25/2014 01:53 PM, Olivier Bonvalet wrote:
> > >> Le mardi 25 mars 2014 à 12:43 -0500, Alex Elder a écrit :
> > >>> Please try applying this, on top of the previous patch.
> > >>> If you can then reproduce the problem we'll have a bunch
> > >>> of new information about the particular request that's
> > >>> leading to the failure.  That might tell us what more we
> > >>> can do to find the root cause.  Thank you.
> > >>>
> > >>>                                      -Alex
> > >>>
> > >>> PS  I hope my mailer doesn't botch the long lines.  It might.
> > >>>
> > >>
> > >> Here the execution will continue, no more kernel panic after this
> > >> debugging display. Is it wanted ?
> > >
> > >
> > > I guess it should panic.  I'm glad you mentioned this.
> > 
> > Just in case, if you haven't done it already: stick rbd_assert(0);
> > after the last printk in that if statement, so it looks like this:
> > 
> > if (which != img_request->next_completion) {
> >         printk("%s: bad image object request information:\n", __func__);
> >         printk("obj_request %p\n", obj_request);
> >         printk("    ->object_name <%s>\n", obj_request->object_name);
> >         ...
> > 
> >         printk("img_request %p\n", img_request);
> >         printk("    ->snap 0x%016llx\n", img_request->snap_id);
> >         ...
> >         printk("    ->result %d\n", img_request->result);
> > 
> >         rbd_assert(0);
> > }
> > 
> > Thanks,
> > 
> >                 Ilya
> > 
> 
> Without the rbd_assert(0), I add this hang :
> 
> 
> Mar 25 21:17:58 murmillia kernel: [ 2205.255933] rbd_img_obj_callback: bad image object request information:
> Mar 25 21:17:58 murmillia kernel: [ 2205.255938] obj_request ffff88025a2b3c48
> Mar 25 21:17:58 murmillia kernel: [ 2205.255940]     ->object_name <rb.0.1536881.238e1f29.000000000439>
> Mar 25 21:17:58 murmillia kernel: [ 2205.255941]     ->offset 0
> Mar 25 21:17:58 murmillia kernel: [ 2205.255943]     ->length 28672
> Mar 25 21:17:58 murmillia kernel: [ 2205.255944]     ->type 0x1
> Mar 25 21:17:58 murmillia kernel: [ 2205.255945]     ->flags 0x3
> Mar 25 21:17:58 murmillia kernel: [ 2205.255946]     ->which 1
> Mar 25 21:17:58 murmillia kernel: [ 2205.255948]     ->xferred 28672
> Mar 25 21:17:58 murmillia kernel: [ 2205.255949]     ->result 0
> Mar 25 21:17:58 murmillia kernel: [ 2205.255950] img_request ffff8802536c4a60
> Mar 25 21:17:58 murmillia kernel: [ 2205.255952]     ->snap 0xffff880257f85ec0
> Mar 25 21:17:58 murmillia kernel: [ 2205.255953]     ->offset 4534026240
> Mar 25 21:17:58 murmillia kernel: [ 2205.255954]     ->length 45056
> Mar 25 21:17:58 murmillia kernel: [ 2205.255955]     ->flags 0x1
> Mar 25 21:17:58 murmillia kernel: [ 2205.255957]     ->obj_request_count 1
> Mar 25 21:17:58 murmillia kernel: [ 2205.255958]     ->next_completion 2
> Mar 25 21:17:58 murmillia kernel: [ 2205.255959]     ->xferred 45056
> Mar 25 21:17:58 murmillia kernel: [ 2205.255960]     ->result 0
> Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 
> Mar 25 21:17:58 murmillia kernel: [ 2205.255962] Assertion failure in rbd_img_obj_callback() at line 2162:
> Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 
> Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 	rbd_assert(which < img_request->obj_request_count);
> Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 
> Mar 25 21:17:58 murmillia kernel: [ 2205.256141] ------------[ cut here ]------------
> Mar 25 21:17:58 murmillia kernel: [ 2205.256178] kernel BUG at drivers/block/rbd.c:2162!
> 
> 
> --

An other one :

Mar 25 21:52:50 alg kernel: [ 1781.377690] rbd_img_obj_callback: bad image object request information:
Mar 25 21:52:50 alg kernel: [ 1781.377696] obj_request ffff88021dda2ae8
Mar 25 21:52:50 alg kernel: [ 1781.377698]     ->object_name <(null)>
Mar 25 21:52:50 alg kernel: [ 1781.377699]     ->offset 0
Mar 25 21:52:50 alg kernel: [ 1781.377701]     ->length 12288
Mar 25 21:52:50 alg kernel: [ 1781.377702]     ->type 0x1
Mar 25 21:52:50 alg kernel: [ 1781.377703]     ->flags 0x3
Mar 25 21:52:50 alg kernel: [ 1781.377704]     ->which 4294967295
Mar 25 21:52:50 alg kernel: [ 1781.377705]     ->xferred 12288
Mar 25 21:52:50 alg kernel: [ 1781.377706]     ->result 0
Mar 25 21:52:50 alg kernel: [ 1781.377707] img_request ffff880223f396a0
Mar 25 21:52:50 alg kernel: [ 1781.377709]     ->snap 0xffff880231dd8cc0
Mar 25 21:52:50 alg kernel: [ 1781.377710]     ->offset 1119846400
Mar 25 21:52:50 alg kernel: [ 1781.377711]     ->length 45056
Mar 25 21:52:50 alg kernel: [ 1781.377712]     ->flags 0x1
Mar 25 21:52:50 alg kernel: [ 1781.377713]     ->obj_request_count 0
Mar 25 21:52:50 alg kernel: [ 1781.377713]     ->next_completion 2
Mar 25 21:52:50 alg kernel: [ 1781.377714]     ->xferred 45056
Mar 25 21:52:50 alg kernel: [ 1781.377715]     ->result 0
Mar 25 21:52:50 alg kernel: [ 1781.377717] 
Mar 25 21:52:50 alg kernel: [ 1781.377717] Assertion failure in rbd_img_obj_callback() at line 2162:
Mar 25 21:52:50 alg kernel: [ 1781.377717] 
Mar 25 21:52:50 alg kernel: [ 1781.377717] 	rbd_assert(which < img_request->obj_request_count);
Mar 25 21:52:50 alg kernel: [ 1781.377717] 
Mar 25 21:52:50 alg kernel: [ 1781.377859] ------------[ cut here ]------------


--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Olivier Bonvalet March 25, 2014, 9:03 p.m. UTC | #10

For me all this images are RBD v1 images, without layering.

# rbd info sas3copies/env4-spool
rbd image 'env4-spool':
	size 10240 MB in 2560 objects
	order 22 (4096 KB objects)
	block_name_prefix: rb.0.1536881.238e1f29
	format: 1

(the one reported here : 
Mar 25 21:17:58 murmillia kernel: [ 2205.255938] obj_request ffff88025a2b3c48
Mar 25 21:17:58 murmillia kernel: [ 2205.255940]     ->object_name <rb.0.1536881.238e1f29.000000000439>
Mar 25 21:17:58 murmillia kernel: [ 2205.255941]     ->offset 0
Mar 25 21:17:58 murmillia kernel: [ 2205.255943]     ->length 28672
Mar 25 21:17:58 murmillia kernel: [ 2205.255944]     ->type 0x1
Mar 25 21:17:58 murmillia kernel: [ 2205.255945]     ->flags 0x3
Mar 25 21:17:58 murmillia kernel: [ 2205.255946]     ->which 1
Mar 25 21:17:58 murmillia kernel: [ 2205.255948]     ->xferred 28672
Mar 25 21:17:58 murmillia kernel: [ 2205.255949]     ->result 0
)

At VM level, this block device is formated in ext4, and used for Exim4
(MTA) spool, and can handle a lots of writes.

And this ceph pool use 3 replica, with a "per network" CRUSH rule.


Le mardi 25 mars 2014 à 15:44 -0500, Alex Elder a écrit :
> Olivier, it appears this is a layered image, i.e., the image
> is a clone of another.  Can you tell us any more about the
> way these images are organized?  Do you have one master image
> and others are based on that?  Is there more than one layer to
> the organization?  (Do these questions make sense?)
> 
> 					-Alex
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Olivier Bonvalet March 25, 2014, 9:10 p.m. UTC | #11

Le mardi 25 mars 2014 à 21:53 +0100, Olivier Bonvalet a écrit :
> Le mardi 25 mars 2014 à 21:21 +0100, Olivier Bonvalet a écrit :
> > Le mardi 25 mars 2014 à 22:18 +0200, Ilya Dryomov a écrit :
> > > On Tue, Mar 25, 2014 at 9:03 PM, Alex Elder <elder@ieee.org> wrote:
> > > > On 03/25/2014 01:53 PM, Olivier Bonvalet wrote:
> > > >> Le mardi 25 mars 2014 à 12:43 -0500, Alex Elder a écrit :
> > > >>> Please try applying this, on top of the previous patch.
> > > >>> If you can then reproduce the problem we'll have a bunch
> > > >>> of new information about the particular request that's
> > > >>> leading to the failure.  That might tell us what more we
> > > >>> can do to find the root cause.  Thank you.
> > > >>>
> > > >>>                                      -Alex
> > > >>>
> > > >>> PS  I hope my mailer doesn't botch the long lines.  It might.
> > > >>>
> > > >>
> > > >> Here the execution will continue, no more kernel panic after this
> > > >> debugging display. Is it wanted ?
> > > >
> > > >
> > > > I guess it should panic.  I'm glad you mentioned this.
> > > 
> > > Just in case, if you haven't done it already: stick rbd_assert(0);
> > > after the last printk in that if statement, so it looks like this:
> > > 
> > > if (which != img_request->next_completion) {
> > >         printk("%s: bad image object request information:\n", __func__);
> > >         printk("obj_request %p\n", obj_request);
> > >         printk("    ->object_name <%s>\n", obj_request->object_name);
> > >         ...
> > > 
> > >         printk("img_request %p\n", img_request);
> > >         printk("    ->snap 0x%016llx\n", img_request->snap_id);
> > >         ...
> > >         printk("    ->result %d\n", img_request->result);
> > > 
> > >         rbd_assert(0);
> > > }
> > > 
> > > Thanks,
> > > 
> > >                 Ilya
> > > 
> > 
> > Without the rbd_assert(0), I add this hang :
> > 
> > 
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255933] rbd_img_obj_callback: bad image object request information:
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255938] obj_request ffff88025a2b3c48
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255940]     ->object_name <rb.0.1536881.238e1f29.000000000439>
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255941]     ->offset 0
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255943]     ->length 28672
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255944]     ->type 0x1
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255945]     ->flags 0x3
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255946]     ->which 1
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255948]     ->xferred 28672
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255949]     ->result 0
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255950] img_request ffff8802536c4a60
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255952]     ->snap 0xffff880257f85ec0
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255953]     ->offset 4534026240
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255954]     ->length 45056
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255955]     ->flags 0x1
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255957]     ->obj_request_count 1
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255958]     ->next_completion 2
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255959]     ->xferred 45056
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255960]     ->result 0
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255962] Assertion failure in rbd_img_obj_callback() at line 2162:
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 	rbd_assert(which < img_request->obj_request_count);
> > Mar 25 21:17:58 murmillia kernel: [ 2205.255962] 
> > Mar 25 21:17:58 murmillia kernel: [ 2205.256141] ------------[ cut here ]------------
> > Mar 25 21:17:58 murmillia kernel: [ 2205.256178] kernel BUG at drivers/block/rbd.c:2162!
> > 
> > 
> > --
> 
> An other one :
> 
> Mar 25 21:52:50 alg kernel: [ 1781.377690] rbd_img_obj_callback: bad image object request information:
> Mar 25 21:52:50 alg kernel: [ 1781.377696] obj_request ffff88021dda2ae8
> Mar 25 21:52:50 alg kernel: [ 1781.377698]     ->object_name <(null)>
> Mar 25 21:52:50 alg kernel: [ 1781.377699]     ->offset 0
> Mar 25 21:52:50 alg kernel: [ 1781.377701]     ->length 12288
> Mar 25 21:52:50 alg kernel: [ 1781.377702]     ->type 0x1
> Mar 25 21:52:50 alg kernel: [ 1781.377703]     ->flags 0x3
> Mar 25 21:52:50 alg kernel: [ 1781.377704]     ->which 4294967295
> Mar 25 21:52:50 alg kernel: [ 1781.377705]     ->xferred 12288
> Mar 25 21:52:50 alg kernel: [ 1781.377706]     ->result 0
> Mar 25 21:52:50 alg kernel: [ 1781.377707] img_request ffff880223f396a0
> Mar 25 21:52:50 alg kernel: [ 1781.377709]     ->snap 0xffff880231dd8cc0
> Mar 25 21:52:50 alg kernel: [ 1781.377710]     ->offset 1119846400
> Mar 25 21:52:50 alg kernel: [ 1781.377711]     ->length 45056
> Mar 25 21:52:50 alg kernel: [ 1781.377712]     ->flags 0x1
> Mar 25 21:52:50 alg kernel: [ 1781.377713]     ->obj_request_count 0
> Mar 25 21:52:50 alg kernel: [ 1781.377713]     ->next_completion 2
> Mar 25 21:52:50 alg kernel: [ 1781.377714]     ->xferred 45056
> Mar 25 21:52:50 alg kernel: [ 1781.377715]     ->result 0
> Mar 25 21:52:50 alg kernel: [ 1781.377717] 
> Mar 25 21:52:50 alg kernel: [ 1781.377717] Assertion failure in rbd_img_obj_callback() at line 2162:
> Mar 25 21:52:50 alg kernel: [ 1781.377717] 
> Mar 25 21:52:50 alg kernel: [ 1781.377717] 	rbd_assert(which < img_request->obj_request_count);
> Mar 25 21:52:50 alg kernel: [ 1781.377717] 
> Mar 25 21:52:50 alg kernel: [ 1781.377859] ------------[ cut here ]------------
> 
> 
> --

The third (now with rbd_assort(0)) :

Mar 25 22:08:12 alg kernel: [  598.301895] rbd_img_obj_callback: bad image object request information:
Mar 25 22:08:12 alg kernel: [  598.301900] obj_request ffff88022409e1b8
Mar 25 22:08:12 alg kernel: [  598.301901]     ->object_name <(null)>
Mar 25 22:08:12 alg kernel: [  598.301902]     ->offset 0
Mar 25 22:08:12 alg kernel: [  598.301903]     ->length 8192
Mar 25 22:08:12 alg kernel: [  598.301904]     ->type 0x1
Mar 25 22:08:12 alg kernel: [  598.301905]     ->flags 0x3
Mar 25 22:08:12 alg kernel: [  598.301906]     ->which 4294967295
Mar 25 22:08:12 alg kernel: [  598.301906]     ->xferred 8192
Mar 25 22:08:12 alg kernel: [  598.301907]     ->result 0
Mar 25 22:08:12 alg kernel: [  598.301908] img_request ffff8802303bff10
Mar 25 22:08:12 alg kernel: [  598.301909]     ->snap 0xffff88022711f500
Mar 25 22:08:12 alg kernel: [  598.301910]     ->offset 4492079104
Mar 25 22:08:12 alg kernel: [  598.301911]     ->length 28672
Mar 25 22:08:12 alg kernel: [  598.301912]     ->flags 0x1
Mar 25 22:08:12 alg kernel: [  598.301913]     ->obj_request_count 0
Mar 25 22:08:12 alg kernel: [  598.301913]     ->next_completion 2
Mar 25 22:08:12 alg kernel: [  598.301914]     ->xferred 28672
Mar 25 22:08:12 alg kernel: [  598.301915]     ->result 0
Mar 25 22:08:12 alg kernel: [  598.301916] 
Mar 25 22:08:12 alg kernel: [  598.301916] Assertion failure in rbd_img_obj_callback() at line 2159:
Mar 25 22:08:12 alg kernel: [  598.301916] 
Mar 25 22:08:12 alg kernel: [  598.301916] 	rbd_assert(0);
Mar 25 22:08:12 alg kernel: [  598.301916] 
Mar 25 22:08:12 alg kernel: [  598.302093] ------------[ cut here ]------------




--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Ilya Dryomov March 25, 2014, 9:20 p.m. UTC | #12

On Tue, Mar 25, 2014 at 11:10 PM, Olivier Bonvalet <ceph.list@daevel.fr> wrote:
> The third (now with rbd_assort(0)) :
>
> Mar 25 22:08:12 alg kernel: [  598.301895] rbd_img_obj_callback: bad image object request information:
> Mar 25 22:08:12 alg kernel: [  598.301900] obj_request ffff88022409e1b8
> Mar 25 22:08:12 alg kernel: [  598.301901]     ->object_name <(null)>
> Mar 25 22:08:12 alg kernel: [  598.301902]     ->offset 0
> Mar 25 22:08:12 alg kernel: [  598.301903]     ->length 8192
> Mar 25 22:08:12 alg kernel: [  598.301904]     ->type 0x1
> Mar 25 22:08:12 alg kernel: [  598.301905]     ->flags 0x3
> Mar 25 22:08:12 alg kernel: [  598.301906]     ->which 4294967295
> Mar 25 22:08:12 alg kernel: [  598.301906]     ->xferred 8192
> Mar 25 22:08:12 alg kernel: [  598.301907]     ->result 0
> Mar 25 22:08:12 alg kernel: [  598.301908] img_request ffff8802303bff10
> Mar 25 22:08:12 alg kernel: [  598.301909]     ->snap 0xffff88022711f500
> Mar 25 22:08:12 alg kernel: [  598.301910]     ->offset 4492079104
> Mar 25 22:08:12 alg kernel: [  598.301911]     ->length 28672
> Mar 25 22:08:12 alg kernel: [  598.301912]     ->flags 0x1
> Mar 25 22:08:12 alg kernel: [  598.301913]     ->obj_request_count 0
> Mar 25 22:08:12 alg kernel: [  598.301913]     ->next_completion 2
> Mar 25 22:08:12 alg kernel: [  598.301914]     ->xferred 28672
> Mar 25 22:08:12 alg kernel: [  598.301915]     ->result 0
> Mar 25 22:08:12 alg kernel: [  598.301916]
> Mar 25 22:08:12 alg kernel: [  598.301916] Assertion failure in rbd_img_obj_callback() at line 2159:
> Mar 25 22:08:12 alg kernel: [  598.301916]
> Mar 25 22:08:12 alg kernel: [  598.301916]      rbd_assert(0);
> Mar 25 22:08:12 alg kernel: [  598.301916]
> Mar 25 22:08:12 alg kernel: [  598.302093] ------------[ cut here ]------------

Hmm...  Can you attach your drivers/block/rbd.c as of this third
occurence?

Thanks,

                Ilya
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Ilya Dryomov March 25, 2014, 9:25 p.m. UTC | #13

On Tue, Mar 25, 2014 at 11:22 PM, Olivier Bonvalet <ceph.list@daevel.fr> wrote:
> Le mardi 25 mars 2014 à 23:20 +0200, Ilya Dryomov a écrit :
>> On Tue, Mar 25, 2014 at 11:10 PM, Olivier Bonvalet <ceph.list@daevel.fr> wrote:
>> > The third (now with rbd_assort(0)) :
>> >
>> > Mar 25 22:08:12 alg kernel: [  598.301895] rbd_img_obj_callback: bad image object request information:
>> > Mar 25 22:08:12 alg kernel: [  598.301900] obj_request ffff88022409e1b8
>> > Mar 25 22:08:12 alg kernel: [  598.301901]     ->object_name <(null)>
>> > Mar 25 22:08:12 alg kernel: [  598.301902]     ->offset 0
>> > Mar 25 22:08:12 alg kernel: [  598.301903]     ->length 8192
>> > Mar 25 22:08:12 alg kernel: [  598.301904]     ->type 0x1
>> > Mar 25 22:08:12 alg kernel: [  598.301905]     ->flags 0x3
>> > Mar 25 22:08:12 alg kernel: [  598.301906]     ->which 4294967295
>> > Mar 25 22:08:12 alg kernel: [  598.301906]     ->xferred 8192
>> > Mar 25 22:08:12 alg kernel: [  598.301907]     ->result 0
>> > Mar 25 22:08:12 alg kernel: [  598.301908] img_request ffff8802303bff10
>> > Mar 25 22:08:12 alg kernel: [  598.301909]     ->snap 0xffff88022711f500
>> > Mar 25 22:08:12 alg kernel: [  598.301910]     ->offset 4492079104
>> > Mar 25 22:08:12 alg kernel: [  598.301911]     ->length 28672
>> > Mar 25 22:08:12 alg kernel: [  598.301912]     ->flags 0x1
>> > Mar 25 22:08:12 alg kernel: [  598.301913]     ->obj_request_count 0
>> > Mar 25 22:08:12 alg kernel: [  598.301913]     ->next_completion 2
>> > Mar 25 22:08:12 alg kernel: [  598.301914]     ->xferred 28672
>> > Mar 25 22:08:12 alg kernel: [  598.301915]     ->result 0
>> > Mar 25 22:08:12 alg kernel: [  598.301916]
>> > Mar 25 22:08:12 alg kernel: [  598.301916] Assertion failure in rbd_img_obj_callback() at line 2159:
>> > Mar 25 22:08:12 alg kernel: [  598.301916]
>> > Mar 25 22:08:12 alg kernel: [  598.301916]      rbd_assert(0);
>> > Mar 25 22:08:12 alg kernel: [  598.301916]
>> > Mar 25 22:08:12 alg kernel: [  598.302093] ------------[ cut here ]------------
>>
>> Hmm...  Can you attach your drivers/block/rbd.c as of this third
>> occurence?
>>
>> Thanks,
>>
>>                 Ilya
>>
>
> Of course.

Sorry, I asked for the drivers/block/rbd.c C source file.

Thanks,

                Ilya
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Olivier Bonvalet March 25, 2014, 9:41 p.m. UTC | #14

Le mardi 25 mars 2014 à 23:25 +0200, Ilya Dryomov a écrit :
> On Tue, Mar 25, 2014 at 11:22 PM, Olivier Bonvalet <ceph.list@daevel.fr> wrote:
> > Le mardi 25 mars 2014 à 23:20 +0200, Ilya Dryomov a écrit :
> >> On Tue, Mar 25, 2014 at 11:10 PM, Olivier Bonvalet <ceph.list@daevel.fr> wrote:
> >> > The third (now with rbd_assort(0)) :
> >> >
> >> > Mar 25 22:08:12 alg kernel: [  598.301895] rbd_img_obj_callback: bad image object request information:
> >> > Mar 25 22:08:12 alg kernel: [  598.301900] obj_request ffff88022409e1b8
> >> > Mar 25 22:08:12 alg kernel: [  598.301901]     ->object_name <(null)>
> >> > Mar 25 22:08:12 alg kernel: [  598.301902]     ->offset 0
> >> > Mar 25 22:08:12 alg kernel: [  598.301903]     ->length 8192
> >> > Mar 25 22:08:12 alg kernel: [  598.301904]     ->type 0x1
> >> > Mar 25 22:08:12 alg kernel: [  598.301905]     ->flags 0x3
> >> > Mar 25 22:08:12 alg kernel: [  598.301906]     ->which 4294967295
> >> > Mar 25 22:08:12 alg kernel: [  598.301906]     ->xferred 8192
> >> > Mar 25 22:08:12 alg kernel: [  598.301907]     ->result 0
> >> > Mar 25 22:08:12 alg kernel: [  598.301908] img_request ffff8802303bff10
> >> > Mar 25 22:08:12 alg kernel: [  598.301909]     ->snap 0xffff88022711f500
> >> > Mar 25 22:08:12 alg kernel: [  598.301910]     ->offset 4492079104
> >> > Mar 25 22:08:12 alg kernel: [  598.301911]     ->length 28672
> >> > Mar 25 22:08:12 alg kernel: [  598.301912]     ->flags 0x1
> >> > Mar 25 22:08:12 alg kernel: [  598.301913]     ->obj_request_count 0
> >> > Mar 25 22:08:12 alg kernel: [  598.301913]     ->next_completion 2
> >> > Mar 25 22:08:12 alg kernel: [  598.301914]     ->xferred 28672
> >> > Mar 25 22:08:12 alg kernel: [  598.301915]     ->result 0
> >> > Mar 25 22:08:12 alg kernel: [  598.301916]
> >> > Mar 25 22:08:12 alg kernel: [  598.301916] Assertion failure in rbd_img_obj_callback() at line 2159:
> >> > Mar 25 22:08:12 alg kernel: [  598.301916]
> >> > Mar 25 22:08:12 alg kernel: [  598.301916]      rbd_assert(0);
> >> > Mar 25 22:08:12 alg kernel: [  598.301916]
> >> > Mar 25 22:08:12 alg kernel: [  598.302093] ------------[ cut here ]------------
> >>
> >> Hmm...  Can you attach your drivers/block/rbd.c as of this third
> >> occurence?
> >>
> >> Thanks,
> >>
> >>                 Ilya
> >>
> >
> > Of course.
> 
> Sorry, I asked for the drivers/block/rbd.c C source file.
> 
> Thanks,
> 
>                 Ilya
> 

mmm the cluster seems to be in a really bad state now : all hosts are
hanging. Is it possible that mounting images without the rbd_assert(0)
broke some images ?


--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Ilya Dryomov March 25, 2014, 9:49 p.m. UTC | #15

On Tue, Mar 25, 2014 at 11:41 PM, Olivier Bonvalet <ceph.list@daevel.fr> wrote:
> mmm the cluster seems to be in a really bad state now : all hosts are
> hanging. Is it possible that mounting images without the rbd_assert(0)
> broke some images ?
>

I don't think so.  As far as I can tell all occurrences that you
reported tripped over one of the asserts.  It's probably just that for
some reason you are now hitting this bug much more frequently than once
a week.

Thanks,

                Ilya
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Olivier Bonvalet March 25, 2014, 9:54 p.m. UTC | #16

Le mardi 25 mars 2014 à 23:49 +0200, Ilya Dryomov a écrit :
> On Tue, Mar 25, 2014 at 11:41 PM, Olivier Bonvalet <ceph.list@daevel.fr> wrote:
> > mmm the cluster seems to be in a really bad state now : all hosts are
> > hanging. Is it possible that mounting images without the rbd_assert(0)
> > broke some images ?
> >
> 
> I don't think so.  As far as I can tell all occurrences that you
> reported tripped over one of the asserts.  It's probably just that for
> some reason you are now hitting this bug much more frequently than once
> a week.
> 
> Thanks,
> 
>                 Ilya
> --

Ok thanks, I'm «reassured».

At reboot VM are much more I/O loaded, because of cache flush. It's
probably the reason why it now hang often.

I have to wait a little between starting each VM.

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Olivier Bonvalet March 25, 2014, 10:17 p.m. UTC | #17

Le mardi 25 mars 2014 à 22:54 +0100, Olivier Bonvalet a écrit :
> Le mardi 25 mars 2014 à 23:49 +0200, Ilya Dryomov a écrit :
> > On Tue, Mar 25, 2014 at 11:41 PM, Olivier Bonvalet <ceph.list@daevel.fr> wrote:
> > > mmm the cluster seems to be in a really bad state now : all hosts are
> > > hanging. Is it possible that mounting images without the rbd_assert(0)
> > > broke some images ?
> > >
> > 
> > I don't think so.  As far as I can tell all occurrences that you
> > reported tripped over one of the asserts.  It's probably just that for
> > some reason you are now hitting this bug much more frequently than once
> > a week.
> > 
> > Thanks,
> > 
> >                 Ilya
> > --
> 
> Ok thanks, I'm «reassured».
> 
> At reboot VM are much more I/O loaded, because of cache flush. It's
> probably the reason why it now hang often.
> 
> I have to wait a little between starting each VM.
> 
> --

I now have this one very often (here 5 minutes after the host boot) :

Mar 25 23:14:45 rurkh kernel: [  330.054196] rbd_img_obj_callback: bad image object request information:
Mar 25 23:14:45 rurkh kernel: [  330.054205] obj_request ffff88025f3df058
Mar 25 23:14:45 rurkh kernel: [  330.054209]     ->object_name <(null)>
Mar 25 23:14:45 rurkh kernel: [  330.054211]     ->offset 0
Mar 25 23:14:45 rurkh kernel: [  330.054213]     ->length 4096
Mar 25 23:14:45 rurkh kernel: [  330.054216]     ->type 0x1
Mar 25 23:14:45 rurkh kernel: [  330.054218]     ->flags 0x3
Mar 25 23:14:45 rurkh kernel: [  330.054220]     ->which 4294967295
Mar 25 23:14:45 rurkh kernel: [  330.054222]     ->xferred 4096
Mar 25 23:14:45 rurkh kernel: [  330.054224]     ->result 0
Mar 25 23:14:45 rurkh kernel: [  330.054227] img_request ffff8802731f8448
Mar 25 23:14:45 rurkh kernel: [  330.054229]     ->snap 0xfffffffffffffffe
Mar 25 23:14:45 rurkh kernel: [  330.054231]     ->offset 2508181504
Mar 25 23:14:45 rurkh kernel: [  330.054233]     ->length 16384
Mar 25 23:14:45 rurkh kernel: [  330.054235]     ->flags 0x0
Mar 25 23:14:45 rurkh kernel: [  330.054237]     ->obj_request_count 0
Mar 25 23:14:45 rurkh kernel: [  330.054239]     ->next_completion 2
Mar 25 23:14:45 rurkh kernel: [  330.054241]     ->xferred 16384
Mar 25 23:14:45 rurkh kernel: [  330.054243]     ->result 0
Mar 25 23:14:45 rurkh kernel: [  330.054247] 
Mar 25 23:14:45 rurkh kernel: [  330.054247] Assertion failure in rbd_img_obj_callback() at line 2159:
Mar 25 23:14:45 rurkh kernel: [  330.054247] 
Mar 25 23:14:45 rurkh kernel: [  330.054247] 	rbd_assert(0);
Mar 25 23:14:45 rurkh kernel: [  330.054247] 
Mar 25 23:14:45 rurkh kernel: [  330.054495] ------------[ cut here ]------------
Mar 25 23:14:45 rurkh kernel: [  330.054585] kernel BUG at drivers/block/rbd.c:2159!
Mar 25 23:14:45 rurkh kernel: [  330.054676] invalid opcode: 0000 [#1] SMP 
Mar 25 23:14:45 rurkh kernel: [  330.054874] Modules linked in: cbc rbd libceph xen_gntdev xt_physdev iptable_filter ip_tables x_tables xfs libcrc32c bridge loop iTCO_wdt gpio_ich iTCO_vendor_support serio_raw sb_edac edac_core evdev i2c_i801 lpc_ich mfd_core ioatdma shpchp wmi ipmi_si ipmi_msghandler ac button dm_mod hid_generic usbhid hid sg sd_mod crc_t10dif crct10dif_common megaraid_sas isci ahci libsas libahci libata scsi_transport_sas ehci_pci ehci_hcd scsi_mod usbcore igb usb_common i2c_algo_bit ixgbe i2c_core dca ptp pps_core mdio
Mar 25 23:14:45 rurkh kernel: [  330.058433] CPU: 2 PID: 6365 Comm: kworker/2:3 Not tainted 3.13-dae-dom0 #22
Mar 25 23:14:45 rurkh kernel: [  330.058528] Hardware name: Supermicro X9DRW-7TPF+/X9DRW-7TPF+, BIOS 3.0 07/24/2013
Mar 25 23:14:45 rurkh kernel: [  330.058659] Workqueue: ceph-msgr con_work [libceph]
Mar 25 23:14:45 rurkh kernel: [  330.058805] task: ffff88026da5b820 ti: ffff88025dfe2000 task.ti: ffff88025dfe2000
Mar 25 23:14:45 rurkh kernel: [  330.058922] RIP: e030:[<ffffffffa0309cd9>]  [<ffffffffa0309cd9>] rbd_img_obj_callback+0x282/0x523 [rbd]
Mar 25 23:14:45 rurkh kernel: [  330.059107] RSP: e02b:ffff88025dfe3ce8  EFLAGS: 00010082
Mar 25 23:14:45 rurkh kernel: [  330.059199] RAX: 000000000000004c RBX: ffff88025f3df058 RCX: 0000000000000007
Mar 25 23:14:45 rurkh kernel: [  330.059300] RDX: 0000000000000006 RSI: 0000000000000000 RDI: ffff88025dfe00a8
Mar 25 23:14:45 rurkh kernel: [  330.059397] RBP: ffff8802731f8448 R08: 0000000000000000 R09: 0000000000000000
Mar 25 23:14:45 rurkh kernel: [  330.059491] R10: 0000000000000000 R11: ffff88025f712d66 R12: 0000000000000001
Mar 25 23:14:45 rurkh kernel: [  330.059587] R13: 0000000000000000 R14: ffff88025f712ad0 R15: 0000000000000000
Mar 25 23:14:45 rurkh kernel: [  330.059689] FS:  00007f2fd8882700(0000) GS:ffff88027fe40000(0000) knlGS:0000000000000000
Mar 25 23:14:45 rurkh kernel: [  330.059807] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 25 23:14:45 rurkh kernel: [  330.059899] CR2: 00007f7a1e28f000 CR3: 000000000160c000 CR4: 0000000000042660
Mar 25 23:14:45 rurkh kernel: [  330.059997] Stack:
Mar 25 23:14:45 rurkh kernel: [  330.060086]  ffff8802731f8484 ffff8802730f2c45 ffffffffffffffff ffff8802730f2c10
Mar 25 23:14:45 rurkh kernel: [  330.060339]  ffff88025f712ac8 ffff8802703b4718 0000000000000000 ffff88025f712ad0
Mar 25 23:14:45 rurkh kernel: [  330.060573]  0000000000000000 ffffffffa02f5595 0000000000000015 ffff8802703b4770
Mar 25 23:14:45 rurkh kernel: [  330.060811] Call Trace:
Mar 25 23:14:45 rurkh kernel: [  330.060878]  [<ffffffffa02f5595>] ? dispatch+0x3e4/0x55e [libceph]
Mar 25 23:14:45 rurkh kernel: [  330.060954]  [<ffffffffa02f00fc>] ? con_work+0xf6e/0x1a65 [libceph]
Mar 25 23:14:45 rurkh kernel: [  330.061029]  [<ffffffff81051f83>] ? mmdrop+0xd/0x1c
Mar 25 23:14:45 rurkh kernel: [  330.061098]  [<ffffffff8105265e>] ? finish_task_switch+0x4d/0x83
Mar 25 23:14:45 rurkh kernel: [  330.061171]  [<ffffffff810484d7>] ? process_one_work+0x15a/0x214
Mar 25 23:14:45 rurkh kernel: [  330.061243]  [<ffffffff8104895b>] ? worker_thread+0x139/0x1de
Mar 25 23:14:45 rurkh kernel: [  330.061313]  [<ffffffff81048822>] ? rescuer_thread+0x26e/0x26e
Mar 25 23:14:45 rurkh kernel: [  330.061385]  [<ffffffff8104cff6>] ? kthread+0x9e/0xa6
Mar 25 23:14:45 rurkh kernel: [  330.061454]  [<ffffffff8104cf58>] ? __kthread_parkme+0x55/0x55
Mar 25 23:14:45 rurkh kernel: [  330.061530]  [<ffffffff8137260c>] ? ret_from_fork+0x7c/0xb0
Mar 25 23:14:45 rurkh kernel: [  330.061606]  [<ffffffff8104cf58>] ? __kthread_parkme+0x55/0x55
Mar 25 23:14:45 rurkh kernel: [  330.061677] Code: cc 30 a0 31 c0 e8 8b e4 05 e1 48 c7 c1 5c cd 30 a0 31 c0 ba 6f 08 00 00 48 c7 c6 80 da 30 a0 48 c7 c7 1f c1 30 a0 e8 6a e4 05 e1 <0f> 0b 41 8b 45 5c ff c8 39 43 40 41 0f 92 c5 48 8b 5b 30 41 ff 
Mar 25 23:14:45 rurkh kernel: [  330.064345] RIP  [<ffffffffa0309cd9>] rbd_img_obj_callback+0x282/0x523 [rbd]
Mar 25 23:14:45 rurkh kernel: [  330.064481]  RSP <ffff88025dfe3ce8>
Mar 25 23:14:45 rurkh kernel: [  330.064562] ---[ end trace 74103a003e0d553e ]---


--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Alex Elder March 25, 2014, 10:46 p.m. UTC | #18

On 03/25/2014 05:17 PM, Olivier Bonvalet wrote:
> Le mardi 25 mars 2014 à 22:54 +0100, Olivier Bonvalet a écrit :
>> Le mardi 25 mars 2014 à 23:49 +0200, Ilya Dryomov a écrit :
>>> On Tue, Mar 25, 2014 at 11:41 PM, Olivier Bonvalet <ceph.list@daevel.fr> wrote:
>>>> mmm the cluster seems to be in a really bad state now : all hosts are
>>>> hanging. Is it possible that mounting images without the rbd_assert(0)
>>>> broke some images ?
>>>>
>>>
>>> I don't think so.  As far as I can tell all occurrences that you
>>> reported tripped over one of the asserts.  It's probably just that for
>>> some reason you are now hitting this bug much more frequently than once
>>> a week.
>>>
>>> Thanks,
>>>
>>>                 Ilya
>>> --
>>
>> Ok thanks, I'm «reassured».
>>
>> At reboot VM are much more I/O loaded, because of cache flush. It's
>> probably the reason why it now hang often.
>>
>> I have to wait a little between starting each VM.
>>
>> --
> 
> I now have this one very often (here 5 minutes after the host boot) :

I am fairly sure this indicates a use-after-free scenario,
likely caused by something getting deleted before every
user was done with it.

I believe Ilya is done for the night; I'm going to spend some
time looking at this to try to determine the cause.  If you
are willing I'd love to have you try whatever fix I come up
with.  I'd rather find a fix than just collect more information,
but I may need to get more, we'll see.

Thank you for all your reports, they help a lot.

					-Alex

> 
> Mar 25 23:14:45 rurkh kernel: [  330.054196] rbd_img_obj_callback: bad image object request information:
> Mar 25 23:14:45 rurkh kernel: [  330.054205] obj_request ffff88025f3df058
> Mar 25 23:14:45 rurkh kernel: [  330.054209]     ->object_name <(null)>
> Mar 25 23:14:45 rurkh kernel: [  330.054211]     ->offset 0
> Mar 25 23:14:45 rurkh kernel: [  330.054213]     ->length 4096
> Mar 25 23:14:45 rurkh kernel: [  330.054216]     ->type 0x1
> Mar 25 23:14:45 rurkh kernel: [  330.054218]     ->flags 0x3
> Mar 25 23:14:45 rurkh kernel: [  330.054220]     ->which 4294967295
> Mar 25 23:14:45 rurkh kernel: [  330.054222]     ->xferred 4096
> Mar 25 23:14:45 rurkh kernel: [  330.054224]     ->result 0
> Mar 25 23:14:45 rurkh kernel: [  330.054227] img_request ffff8802731f8448
> Mar 25 23:14:45 rurkh kernel: [  330.054229]     ->snap 0xfffffffffffffffe
> Mar 25 23:14:45 rurkh kernel: [  330.054231]     ->offset 2508181504
> Mar 25 23:14:45 rurkh kernel: [  330.054233]     ->length 16384
> Mar 25 23:14:45 rurkh kernel: [  330.054235]     ->flags 0x0
> Mar 25 23:14:45 rurkh kernel: [  330.054237]     ->obj_request_count 0
> Mar 25 23:14:45 rurkh kernel: [  330.054239]     ->next_completion 2
> Mar 25 23:14:45 rurkh kernel: [  330.054241]     ->xferred 16384
> Mar 25 23:14:45 rurkh kernel: [  330.054243]     ->result 0
> Mar 25 23:14:45 rurkh kernel: [  330.054247] 
> Mar 25 23:14:45 rurkh kernel: [  330.054247] Assertion failure in rbd_img_obj_callback() at line 2159:
> Mar 25 23:14:45 rurkh kernel: [  330.054247] 
> Mar 25 23:14:45 rurkh kernel: [  330.054247] 	rbd_assert(0);
> Mar 25 23:14:45 rurkh kernel: [  330.054247] 
> Mar 25 23:14:45 rurkh kernel: [  330.054495] ------------[ cut here ]------------
> Mar 25 23:14:45 rurkh kernel: [  330.054585] kernel BUG at drivers/block/rbd.c:2159!
> Mar 25 23:14:45 rurkh kernel: [  330.054676] invalid opcode: 0000 [#1] SMP 
> Mar 25 23:14:45 rurkh kernel: [  330.054874] Modules linked in: cbc rbd libceph xen_gntdev xt_physdev iptable_filter ip_tables x_tables xfs libcrc32c bridge loop iTCO_wdt gpio_ich iTCO_vendor_support serio_raw sb_edac edac_core evdev i2c_i801 lpc_ich mfd_core ioatdma shpchp wmi ipmi_si ipmi_msghandler ac button dm_mod hid_generic usbhid hid sg sd_mod crc_t10dif crct10dif_common megaraid_sas isci ahci libsas libahci libata scsi_transport_sas ehci_pci ehci_hcd scsi_mod usbcore igb usb_common i2c_algo_bit ixgbe i2c_core dca ptp pps_core mdio
> Mar 25 23:14:45 rurkh kernel: [  330.058433] CPU: 2 PID: 6365 Comm: kworker/2:3 Not tainted 3.13-dae-dom0 #22
> Mar 25 23:14:45 rurkh kernel: [  330.058528] Hardware name: Supermicro X9DRW-7TPF+/X9DRW-7TPF+, BIOS 3.0 07/24/2013
> Mar 25 23:14:45 rurkh kernel: [  330.058659] Workqueue: ceph-msgr con_work [libceph]
> Mar 25 23:14:45 rurkh kernel: [  330.058805] task: ffff88026da5b820 ti: ffff88025dfe2000 task.ti: ffff88025dfe2000
> Mar 25 23:14:45 rurkh kernel: [  330.058922] RIP: e030:[<ffffffffa0309cd9>]  [<ffffffffa0309cd9>] rbd_img_obj_callback+0x282/0x523 [rbd]
> Mar 25 23:14:45 rurkh kernel: [  330.059107] RSP: e02b:ffff88025dfe3ce8  EFLAGS: 00010082
> Mar 25 23:14:45 rurkh kernel: [  330.059199] RAX: 000000000000004c RBX: ffff88025f3df058 RCX: 0000000000000007
> Mar 25 23:14:45 rurkh kernel: [  330.059300] RDX: 0000000000000006 RSI: 0000000000000000 RDI: ffff88025dfe00a8
> Mar 25 23:14:45 rurkh kernel: [  330.059397] RBP: ffff8802731f8448 R08: 0000000000000000 R09: 0000000000000000
> Mar 25 23:14:45 rurkh kernel: [  330.059491] R10: 0000000000000000 R11: ffff88025f712d66 R12: 0000000000000001
> Mar 25 23:14:45 rurkh kernel: [  330.059587] R13: 0000000000000000 R14: ffff88025f712ad0 R15: 0000000000000000
> Mar 25 23:14:45 rurkh kernel: [  330.059689] FS:  00007f2fd8882700(0000) GS:ffff88027fe40000(0000) knlGS:0000000000000000
> Mar 25 23:14:45 rurkh kernel: [  330.059807] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> Mar 25 23:14:45 rurkh kernel: [  330.059899] CR2: 00007f7a1e28f000 CR3: 000000000160c000 CR4: 0000000000042660
> Mar 25 23:14:45 rurkh kernel: [  330.059997] Stack:
> Mar 25 23:14:45 rurkh kernel: [  330.060086]  ffff8802731f8484 ffff8802730f2c45 ffffffffffffffff ffff8802730f2c10
> Mar 25 23:14:45 rurkh kernel: [  330.060339]  ffff88025f712ac8 ffff8802703b4718 0000000000000000 ffff88025f712ad0
> Mar 25 23:14:45 rurkh kernel: [  330.060573]  0000000000000000 ffffffffa02f5595 0000000000000015 ffff8802703b4770
> Mar 25 23:14:45 rurkh kernel: [  330.060811] Call Trace:
> Mar 25 23:14:45 rurkh kernel: [  330.060878]  [<ffffffffa02f5595>] ? dispatch+0x3e4/0x55e [libceph]
> Mar 25 23:14:45 rurkh kernel: [  330.060954]  [<ffffffffa02f00fc>] ? con_work+0xf6e/0x1a65 [libceph]
> Mar 25 23:14:45 rurkh kernel: [  330.061029]  [<ffffffff81051f83>] ? mmdrop+0xd/0x1c
> Mar 25 23:14:45 rurkh kernel: [  330.061098]  [<ffffffff8105265e>] ? finish_task_switch+0x4d/0x83
> Mar 25 23:14:45 rurkh kernel: [  330.061171]  [<ffffffff810484d7>] ? process_one_work+0x15a/0x214
> Mar 25 23:14:45 rurkh kernel: [  330.061243]  [<ffffffff8104895b>] ? worker_thread+0x139/0x1de
> Mar 25 23:14:45 rurkh kernel: [  330.061313]  [<ffffffff81048822>] ? rescuer_thread+0x26e/0x26e
> Mar 25 23:14:45 rurkh kernel: [  330.061385]  [<ffffffff8104cff6>] ? kthread+0x9e/0xa6
> Mar 25 23:14:45 rurkh kernel: [  330.061454]  [<ffffffff8104cf58>] ? __kthread_parkme+0x55/0x55
> Mar 25 23:14:45 rurkh kernel: [  330.061530]  [<ffffffff8137260c>] ? ret_from_fork+0x7c/0xb0
> Mar 25 23:14:45 rurkh kernel: [  330.061606]  [<ffffffff8104cf58>] ? __kthread_parkme+0x55/0x55
> Mar 25 23:14:45 rurkh kernel: [  330.061677] Code: cc 30 a0 31 c0 e8 8b e4 05 e1 48 c7 c1 5c cd 30 a0 31 c0 ba 6f 08 00 00 48 c7 c6 80 da 30 a0 48 c7 c7 1f c1 30 a0 e8 6a e4 05 e1 <0f> 0b 41 8b 45 5c ff c8 39 43 40 41 0f 92 c5 48 8b 5b 30 41 ff 
> Mar 25 23:14:45 rurkh kernel: [  330.064345] RIP  [<ffffffffa0309cd9>] rbd_img_obj_callback+0x282/0x523 [rbd]
> Mar 25 23:14:45 rurkh kernel: [  330.064481]  RSP <ffff88025dfe3ce8>
> Mar 25 23:14:45 rurkh kernel: [  330.064562] ---[ end trace 74103a003e0d553e ]---
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Olivier Bonvalet March 25, 2014, 11:04 p.m. UTC | #19

Le mardi 25 mars 2014 à 17:46 -0500, Alex Elder a écrit :
> On 03/25/2014 05:17 PM, Olivier Bonvalet wrote:
> > 
> > I now have this one very often (here 5 minutes after the host boot) :
> 
> I am fairly sure this indicates a use-after-free scenario,
> likely caused by something getting deleted before every
> user was done with it.
> 
> I believe Ilya is done for the night; I'm going to spend some
> time looking at this to try to determine the cause.  If you
> are willing I'd love to have you try whatever fix I come up
> with.  I'd rather find a fix than just collect more information,
> but I may need to get more, we'll see.
> 
> Thank you for all your reports, they help a lot.
> 
> 					-Alex

Ok. I can apply some patch to help debug that yes.
I will try to reproduce on a different host, without customer data.

But I think I will stop here for the night too.

Thanks for your time,
Olivier

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Alex Elder March 26, 2014, 2:54 a.m. UTC | #20

OK, here's some more analysis of this one.  I'm pretty
convinced the object request in question has already
been freed/destroyed.  The real question is "why?"

On 03/25/2014 05:17 PM, Olivier Bonvalet wrote:
> Mar 25 23:14:45 rurkh kernel: [  330.054196] rbd_img_obj_callback: bad image object request information:
> Mar 25 23:14:45 rurkh kernel: [  330.054205] obj_request ffff88025f3df058
> Mar 25 23:14:45 rurkh kernel: [  330.054209]     ->object_name <(null)>
This can occur for an object request that has been destroyed.
After the name is freed, the field is set to NULL.

> Mar 25 23:14:45 rurkh kernel: [  330.054211]     ->offset 0
> Mar 25 23:14:45 rurkh kernel: [  330.054213]     ->length 4096
Given the image request, this makes sense as the second of
two object request in this image request.  (More on the offsets
below).

> Mar 25 23:14:45 rurkh kernel: [  330.054216]     ->type 0x1
This is OBJ_REQUEST_BIO.  Reasonable.

> Mar 25 23:14:45 rurkh kernel: [  330.054218]     ->flags 0x3
This means IMG_DATA and DONE are set.  It's an image object
(which is expected), and it's done (which is also expected,
because we're calling the callback function).

> Mar 25 23:14:45 rurkh kernel: [  330.054220]     ->which 4294967295
This is BAD_WHICH.  That value overwrites the original when
an image object request is removed from its image in
rbd_img_obj_request_del().

> Mar 25 23:14:45 rurkh kernel: [  330.054222]     ->xferred 4096
> Mar 25 23:14:45 rurkh kernel: [  330.054224]     ->result 0
> Mar 25 23:14:45 rurkh kernel: [  330.054227] img_request ffff8802731f8448
> Mar 25 23:14:45 rurkh kernel: [  330.054229]     ->snap 0xfffffffffffffffe
This is CEPH_NOSNAP.

> Mar 25 23:14:45 rurkh kernel: [  330.054231]     ->offset 2508181504
> Mar 25 23:14:45 rurkh kernel: [  330.054233]     ->length 16384
Offset is 0x957FD000, length 0x4000,  The first object request
should be (assuming 4MB objects) object 255, offset 0x3FD000,
length 0x3000 (or 12KB)  The second object request should be in
object 256, offset 0, length 0x1000 (or 4KB).

> Mar 25 23:14:45 rurkh kernel: [  330.054235]     ->flags 0x0
It is not layered nor initiated by a child image.  It was a read.

> Mar 25 23:14:45 rurkh kernel: [  330.054237]     ->obj_request_count 0
As objects are removed from an image request, the request count
is decremented until it's zero.  This suggests this image request
(which looks for the most part valid) has been destroyed (or at
least is in the process).

> Mar 25 23:14:45 rurkh kernel: [  330.054239]     ->next_completion 2
For an image request comprised of 2 object requests, this is
the right value when both of the object requests have completed.
The next completion (which will never occur) equals the object
request count (2).

> Mar 25 23:14:45 rurkh kernel: [  330.054241]     ->xferred 16384
This means the request transferred completely...

> Mar 25 23:14:45 rurkh kernel: [  330.054243]     ->result 0
...and was successful.

So, it looks like the image request completed successfully.  The
first and second object requests must have both passed through
rbd_img_obj_callback() in order for this to occur, and both of
them transferred all their data successfully.

However, at the time of this assertion failure, the callback
is being called for the already-completed second object
request.  Why?

The only place an object request's callback is called is in
rbd_obj_request_complete().  Olivier reports that these are
version 1 images and there is no layering involved.  So the
only place rbd_obj_request_complete() is called is from
rbd_osd_req_callback(), and then only if the object request
has been marked done by (in this case) rbd_osd_read_callback().
For a non-layered image object request, that always occurs
(after possibly zeroing all or part of the buffer).  So it
makes sense that these OSD read requests completed normally,
and triggered object request completion, and when both of
them had completed they caused the image request to complete.

This suggests that rbd_osd_req_callback() got called more
than once for that second object request.  That function is
called only via ceph_osd_request->r_callback(), and that
only occurs in net/ceph/osd_client.c:handle_reply().  It
only calls it if local variable already_completed is false.
That's set to the osd request's r_got_reply value, taken
while the OSD client' request mutex is held.

Any chance the osd client can be calling r_callback twice?

A second possibility is that the last object request
got duplicated somehow.  Right now I'm looking at
rbd_img_request_fill() and bio_chain_clone_range() to
see if I can see a way that could happen.

That's it for now.  I'm pretty sure this is getting
pretty close to the root cause.

					-Alex
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Olivier Bonvalet March 26, 2014, 3:58 a.m. UTC | #21

Le mardi 25 mars 2014 à 21:54 -0500, Alex Elder a écrit :
> OK, here's some more analysis of this one.  I'm pretty
> convinced the object request in question has already
> been freed/destroyed.  The real question is "why?"
> 
> On 03/25/2014 05:17 PM, Olivier Bonvalet wrote:
> > Mar 25 23:14:45 rurkh kernel: [  330.054196] rbd_img_obj_callback: bad image object request information:
> > Mar 25 23:14:45 rurkh kernel: [  330.054205] obj_request ffff88025f3df058
> > Mar 25 23:14:45 rurkh kernel: [  330.054209]     ->object_name <(null)>
> This can occur for an object request that has been destroyed.
> After the name is freed, the field is set to NULL.
> 
> > Mar 25 23:14:45 rurkh kernel: [  330.054211]     ->offset 0
> > Mar 25 23:14:45 rurkh kernel: [  330.054213]     ->length 4096
> Given the image request, this makes sense as the second of
> two object request in this image request.  (More on the offsets
> below).
> 
> > Mar 25 23:14:45 rurkh kernel: [  330.054216]     ->type 0x1
> This is OBJ_REQUEST_BIO.  Reasonable.
> 
> > Mar 25 23:14:45 rurkh kernel: [  330.054218]     ->flags 0x3
> This means IMG_DATA and DONE are set.  It's an image object
> (which is expected), and it's done (which is also expected,
> because we're calling the callback function).
> 
> > Mar 25 23:14:45 rurkh kernel: [  330.054220]     ->which 4294967295
> This is BAD_WHICH.  That value overwrites the original when
> an image object request is removed from its image in
> rbd_img_obj_request_del().
> 
> > Mar 25 23:14:45 rurkh kernel: [  330.054222]     ->xferred 4096
> > Mar 25 23:14:45 rurkh kernel: [  330.054224]     ->result 0
> > Mar 25 23:14:45 rurkh kernel: [  330.054227] img_request ffff8802731f8448
> > Mar 25 23:14:45 rurkh kernel: [  330.054229]     ->snap 0xfffffffffffffffe
> This is CEPH_NOSNAP.
> 
> > Mar 25 23:14:45 rurkh kernel: [  330.054231]     ->offset 2508181504
> > Mar 25 23:14:45 rurkh kernel: [  330.054233]     ->length 16384
> Offset is 0x957FD000, length 0x4000,  The first object request
> should be (assuming 4MB objects) object 255, offset 0x3FD000,
> length 0x3000 (or 12KB)  The second object request should be in
> object 256, offset 0, length 0x1000 (or 4KB).
> 
> > Mar 25 23:14:45 rurkh kernel: [  330.054235]     ->flags 0x0
> It is not layered nor initiated by a child image.  It was a read.
> 
> > Mar 25 23:14:45 rurkh kernel: [  330.054237]     ->obj_request_count 0
> As objects are removed from an image request, the request count
> is decremented until it's zero.  This suggests this image request
> (which looks for the most part valid) has been destroyed (or at
> least is in the process).
> 
> > Mar 25 23:14:45 rurkh kernel: [  330.054239]     ->next_completion 2
> For an image request comprised of 2 object requests, this is
> the right value when both of the object requests have completed.
> The next completion (which will never occur) equals the object
> request count (2).
> 
> > Mar 25 23:14:45 rurkh kernel: [  330.054241]     ->xferred 16384
> This means the request transferred completely...
> 
> > Mar 25 23:14:45 rurkh kernel: [  330.054243]     ->result 0
> ...and was successful.
> 
> 
> So, it looks like the image request completed successfully.  The
> first and second object requests must have both passed through
> rbd_img_obj_callback() in order for this to occur, and both of
> them transferred all their data successfully.
> 
> However, at the time of this assertion failure, the callback
> is being called for the already-completed second object
> request.  Why?
> 
> The only place an object request's callback is called is in
> rbd_obj_request_complete().  Olivier reports that these are
> version 1 images and there is no layering involved.  So the
> only place rbd_obj_request_complete() is called is from
> rbd_osd_req_callback(), and then only if the object request
> has been marked done by (in this case) rbd_osd_read_callback().
> For a non-layered image object request, that always occurs
> (after possibly zeroing all or part of the buffer).  So it
> makes sense that these OSD read requests completed normally,
> and triggered object request completion, and when both of
> them had completed they caused the image request to complete.
> 
> This suggests that rbd_osd_req_callback() got called more
> than once for that second object request.  That function is
> called only via ceph_osd_request->r_callback(), and that
> only occurs in net/ceph/osd_client.c:handle_reply().  It
> only calls it if local variable already_completed is false.
> That's set to the osd request's r_got_reply value, taken
> while the OSD client' request mutex is held.
> 
> Any chance the osd client can be calling r_callback twice?
> 
> A second possibility is that the last object request
> got duplicated somehow.  Right now I'm looking at
> rbd_img_request_fill() and bio_chain_clone_range() to
> see if I can see a way that could happen.
> 
> That's it for now.  I'm pretty sure this is getting
> pretty close to the root cause.
> 
> 					-Alex
> 

I'm pretty sure that the new high frequency of the crash is a part of
the problem. Before the spinlock fix, the problem was hard to trigger
(once per week, maybe per day on two hosts only). Now it's really hard
to *don't* trigger it : I now have 5 client servers, and only one of
them have more than one hour of uptime.

I can be wrong, since I don't know neither really understand the code,
but from what I understand, there was a "ghost" request, a duplicate,
that follow the good one. Before the spinlock fix, this second request
could be proceed if next_completion was not yet modified, so this "bug"
hadn't a big impact.

But now that next_completion is properly checked with the spinlock, that
ghost request is no more proceed neither ignored, it throws a kernel
bug.

From my point of view, since it seems to be an orphaned read request, it
should be track in logs (this request should not exists) but then it
should be ignored without locking the kernel.

From now, I switch on previous kernels (yes, I should do that some hours
sooner ;)).

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Issue #5876 : assertion failure in rbd_img_obj_callback()

Commit Message

Comments

Patch