[GSoC] wrong patch "msg: change entity_addr_t encode/decode"
diff mbox

Message ID alpine.DEB.2.11.1606021238150.6221@cpach.fuggernut.com
State New
Headers show

Commit Message

Sage Weil June 2, 2016, 5:04 p.m. UTC
Hi Zhao,

On Tue, 31 May 2016, Junwang Zhao wrote:
> Hi Sage & Haomai,
> 
> I try to run './vstart.sh -d -x -n' to test the patches today, I got
> some error like 'wrong node', after some work, I found the
> error was introduced by this patch[0]. I change the code to
> use the legacy decode and encode, but got stuck somewhere.
> 
> --- a/src/msg/msg_types.h
> +++ b/src/msg/msg_types.h
> @@ -371,7 +371,7 @@ struct entity_addr_t {
>    // broader study
> 
>    void encode(bufferlist& bl, uint64_t features) const {
> -    if ((features & CEPH_FEATURE_MSG_ADDR2) == 0) {
> +    if ((features & CEPH_FEATURE_MSG_ADDR2) != 0) {
>        ::encode((__u32)0, bl);
>        ::encode(nonce, bl);
>        sockaddr_storage ss = get_sockaddr_storage();
> 
> 
> Can you please see the patch again and give some comments?
> 
> [0]https://github.com/zhjwpku/ceph/commit/8934f038fdf28156038770e601e4a27f7a684390

This was enough to make things work fo rme:


but I think this is not the right fix.  I think we still want 
TYPE_NONE, TYPE_LEGACY, and (soon) TYPE_MSGR2, and that the core problem 
is places that are creating addrs aren't setting the type properly.  Going 
to play with this a bit...

sage

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Sage Weil June 2, 2016, 5:46 p.m. UTC | #1
On Thu, 2 Jun 2016, Sage Weil wrote:
> Hi Zhao,
> 
> On Tue, 31 May 2016, Junwang Zhao wrote:
> > Hi Sage & Haomai,
> > 
> > I try to run './vstart.sh -d -x -n' to test the patches today, I got
> > some error like 'wrong node', after some work, I found the
> > error was introduced by this patch[0]. I change the code to
> > use the legacy decode and encode, but got stuck somewhere.
> > 
> > --- a/src/msg/msg_types.h
> > +++ b/src/msg/msg_types.h
> > @@ -371,7 +371,7 @@ struct entity_addr_t {
> >    // broader study
> > 
> >    void encode(bufferlist& bl, uint64_t features) const {
> > -    if ((features & CEPH_FEATURE_MSG_ADDR2) == 0) {
> > +    if ((features & CEPH_FEATURE_MSG_ADDR2) != 0) {
> >        ::encode((__u32)0, bl);
> >        ::encode(nonce, bl);
> >        sockaddr_storage ss = get_sockaddr_storage();
> > 
> > 
> > Can you please see the patch again and give some comments?
> > 
> > [0]https://github.com/zhjwpku/ceph/commit/8934f038fdf28156038770e601e4a27f7a684390
> 
> This was enough to make things work fo rme:
> 
> diff --git a/src/msg/msg_types.h b/src/msg/msg_types.h
> index 9c521e6..0da0614 100644
> --- a/src/msg/msg_types.h
> +++ b/src/msg/msg_types.h
> @@ -199,8 +199,7 @@ WRITE_CLASS_ENCODER(ceph_sockaddr_storage)
>  
>  struct entity_addr_t {
>    typedef enum {
> -    TYPE_NONE = 0,
> -    TYPE_LEGACY = 1,
> +    TYPE_LEGACY = 0,
>    } type_t;
>  
>    __u32 type;
> @@ -211,7 +210,7 @@ struct entity_addr_t {
>      sockaddr_in6 sin6;
>    } u;
>  
> -  entity_addr_t() : type(0), nonce(0) { 
> +  entity_addr_t() : type(TYPE_LEGACY), nonce(0) {
>      memset(&u, 0, sizeof(u));
>    }
>    explicit entity_addr_t(const ceph_entity_addr &o) {
> 
> but I think this is not the right fix.  I think we still want 
> TYPE_NONE, TYPE_LEGACY, and (soon) TYPE_MSGR2, and that the core problem 
> is places that are creating addrs aren't setting the type properly.  Going 
> to play with this a bit...

I pushed a wip-addr-more branch:

	https://github.com/liewegas/ceph/commits/wip-addr-more

This fixes the parser to take type prefixes, default to legacy, and 
changes entity_addr_t() so that it renders are "-" instead of ":/0" (which 
was weird anyway).

I think what we need next are some unit tests in test/test_addrs.cc that 
do things like encode a entity_addr_t with no features, decode as 
entity_addrvec_t, or construct varous addrvecs, encode with no features, 
an ensure a decode to entity_addr_t gives back the right addr from the 
list.

sage
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John Hunter June 6, 2016, 8:28 a.m. UTC | #2
Hi, Sage,

Sorry I did not see your reply util now, I will work on it.

Thanks!
Zhao

On Fri, Jun 3, 2016 at 1:46 AM, Sage Weil <sage@newdream.net> wrote:
> On Thu, 2 Jun 2016, Sage Weil wrote:
>> Hi Zhao,
>>
>> On Tue, 31 May 2016, Junwang Zhao wrote:
>> > Hi Sage & Haomai,
>> >
>> > I try to run './vstart.sh -d -x -n' to test the patches today, I got
>> > some error like 'wrong node', after some work, I found the
>> > error was introduced by this patch[0]. I change the code to
>> > use the legacy decode and encode, but got stuck somewhere.
>> >
>> > --- a/src/msg/msg_types.h
>> > +++ b/src/msg/msg_types.h
>> > @@ -371,7 +371,7 @@ struct entity_addr_t {
>> >    // broader study
>> >
>> >    void encode(bufferlist& bl, uint64_t features) const {
>> > -    if ((features & CEPH_FEATURE_MSG_ADDR2) == 0) {
>> > +    if ((features & CEPH_FEATURE_MSG_ADDR2) != 0) {
>> >        ::encode((__u32)0, bl);
>> >        ::encode(nonce, bl);
>> >        sockaddr_storage ss = get_sockaddr_storage();
>> >
>> >
>> > Can you please see the patch again and give some comments?
>> >
>> > [0]https://github.com/zhjwpku/ceph/commit/8934f038fdf28156038770e601e4a27f7a684390
>>
>> This was enough to make things work fo rme:
>>
>> diff --git a/src/msg/msg_types.h b/src/msg/msg_types.h
>> index 9c521e6..0da0614 100644
>> --- a/src/msg/msg_types.h
>> +++ b/src/msg/msg_types.h
>> @@ -199,8 +199,7 @@ WRITE_CLASS_ENCODER(ceph_sockaddr_storage)
>>
>>  struct entity_addr_t {
>>    typedef enum {
>> -    TYPE_NONE = 0,
>> -    TYPE_LEGACY = 1,
>> +    TYPE_LEGACY = 0,
>>    } type_t;
>>
>>    __u32 type;
>> @@ -211,7 +210,7 @@ struct entity_addr_t {
>>      sockaddr_in6 sin6;
>>    } u;
>>
>> -  entity_addr_t() : type(0), nonce(0) {
>> +  entity_addr_t() : type(TYPE_LEGACY), nonce(0) {
>>      memset(&u, 0, sizeof(u));
>>    }
>>    explicit entity_addr_t(const ceph_entity_addr &o) {
>>
>> but I think this is not the right fix.  I think we still want
>> TYPE_NONE, TYPE_LEGACY, and (soon) TYPE_MSGR2, and that the core problem
>> is places that are creating addrs aren't setting the type properly.  Going
>> to play with this a bit...
>
> I pushed a wip-addr-more branch:
>
>         https://github.com/liewegas/ceph/commits/wip-addr-more
>
> This fixes the parser to take type prefixes, default to legacy, and
> changes entity_addr_t() so that it renders are "-" instead of ":/0" (which
> was weird anyway).
>
> I think what we need next are some unit tests in test/test_addrs.cc that
> do things like encode a entity_addr_t with no features, decode as
> entity_addrvec_t, or construct varous addrvecs, encode with no features,
> an ensure a decode to entity_addr_t gives back the right addr from the
> list.
>
> sage
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John Hunter June 8, 2016, 1:30 p.m. UTC | #3
Hi Sage,

Write some unittest, but seems not right, see [0], give some comments
please, I use `cout` to print the value, but I don't know where to see them.

I got a lot of other "FAIL" in the wip-addr-more branch, like:
  ceph-detect-init/run-tox.sh
  ceph-disk/run-tox.sh
  test/run-rbd-unit-tests.sh
I am wondering is this because my working environment? My laptop
is really not very powerful, "i5-3210M CPU, 8G memory" :(
make check is quite slow.

[0] https://github.com/zhjwpku/ceph/commit/e15a240d56c2b67a8b0906dc80650a91bff1518f

On Mon, Jun 6, 2016 at 4:28 PM, Junwang Zhao <zhjwpku@gmail.com> wrote:
> Hi, Sage,
>
> Sorry I did not see your reply util now, I will work on it.
>
> Thanks!
> Zhao
>
> On Fri, Jun 3, 2016 at 1:46 AM, Sage Weil <sage@newdream.net> wrote:
>> On Thu, 2 Jun 2016, Sage Weil wrote:
>>> Hi Zhao,
>>>
>>> On Tue, 31 May 2016, Junwang Zhao wrote:
>>> > Hi Sage & Haomai,
>>> >
>>> > I try to run './vstart.sh -d -x -n' to test the patches today, I got
>>> > some error like 'wrong node', after some work, I found the
>>> > error was introduced by this patch[0]. I change the code to
>>> > use the legacy decode and encode, but got stuck somewhere.
>>> >
>>> > --- a/src/msg/msg_types.h
>>> > +++ b/src/msg/msg_types.h
>>> > @@ -371,7 +371,7 @@ struct entity_addr_t {
>>> >    // broader study
>>> >
>>> >    void encode(bufferlist& bl, uint64_t features) const {
>>> > -    if ((features & CEPH_FEATURE_MSG_ADDR2) == 0) {
>>> > +    if ((features & CEPH_FEATURE_MSG_ADDR2) != 0) {
>>> >        ::encode((__u32)0, bl);
>>> >        ::encode(nonce, bl);
>>> >        sockaddr_storage ss = get_sockaddr_storage();
>>> >
>>> >
>>> > Can you please see the patch again and give some comments?
>>> >
>>> > [0]https://github.com/zhjwpku/ceph/commit/8934f038fdf28156038770e601e4a27f7a684390
>>>
>>> This was enough to make things work fo rme:
>>>
>>> diff --git a/src/msg/msg_types.h b/src/msg/msg_types.h
>>> index 9c521e6..0da0614 100644
>>> --- a/src/msg/msg_types.h
>>> +++ b/src/msg/msg_types.h
>>> @@ -199,8 +199,7 @@ WRITE_CLASS_ENCODER(ceph_sockaddr_storage)
>>>
>>>  struct entity_addr_t {
>>>    typedef enum {
>>> -    TYPE_NONE = 0,
>>> -    TYPE_LEGACY = 1,
>>> +    TYPE_LEGACY = 0,
>>>    } type_t;
>>>
>>>    __u32 type;
>>> @@ -211,7 +210,7 @@ struct entity_addr_t {
>>>      sockaddr_in6 sin6;
>>>    } u;
>>>
>>> -  entity_addr_t() : type(0), nonce(0) {
>>> +  entity_addr_t() : type(TYPE_LEGACY), nonce(0) {
>>>      memset(&u, 0, sizeof(u));
>>>    }
>>>    explicit entity_addr_t(const ceph_entity_addr &o) {
>>>
>>> but I think this is not the right fix.  I think we still want
>>> TYPE_NONE, TYPE_LEGACY, and (soon) TYPE_MSGR2, and that the core problem
>>> is places that are creating addrs aren't setting the type properly.  Going
>>> to play with this a bit...
>>
>> I pushed a wip-addr-more branch:
>>
>>         https://github.com/liewegas/ceph/commits/wip-addr-more
>>
>> This fixes the parser to take type prefixes, default to legacy, and
>> changes entity_addr_t() so that it renders are "-" instead of ":/0" (which
>> was weird anyway).
>>
>> I think what we need next are some unit tests in test/test_addrs.cc that
>> do things like encode a entity_addr_t with no features, decode as
>> entity_addrvec_t, or construct varous addrvecs, encode with no features,
>> an ensure a decode to entity_addr_t gives back the right addr from the
>> list.
>>
>> sage
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sage Weil June 9, 2016, 4:10 p.m. UTC | #4
On Wed, 8 Jun 2016, Junwang Zhao wrote:
> Hi Sage,
> 
> Write some unittest, but seems not right, see [0], give some comments
> please, I use `cout` to print the value, but I don't know where to see them.
> 
> I got a lot of other "FAIL" in the wip-addr-more branch, like:
>   ceph-detect-init/run-tox.sh
>   ceph-disk/run-tox.sh
>   test/run-rbd-unit-tests.sh
> I am wondering is this because my working environment? My laptop
> is really not very powerful, "i5-3210M CPU, 8G memory" :(
> make check is quite slow.
> 
> [0] https://github.com/zhjwpku/ceph/commit/e15a240d56c2b67a8b0906dc80650a91bff1518f

I'm traveling today and not able to replicate this yet--I'll take a look 
tomorrow.

In the meantime, though, it doesn't look like your new tests are doing 
quite what I think we need them to.  I think the important cases are:

1- make sure that any encoded entity_addr_t can be decoded as a 
entity_addrvec_t, with size 1, where item 0 matches the addr.  Can use the 
existing arrays of addrs to parse to test this.

2- make sure that in all of those cases, if we encode the entity_addrvec_t 
with features 0, and decode as an entity_addr_t, we get back the same 
entity_addr_t.

3- the addrvec size > 1 cases:

- populate an addrvec with multiple addrs where one is legacy and others 
are not, and ensure that encoding with features 0 and decoding as an addr 
gives back the legacy item.

- multiple legacy addrs with features 0 should encode the first one in the 
list

- a vec with all non-legacy addrs encoded with features 0 and decoded as 
addr should give back an addr that is not entity_addr_t() but is 
unusable/bogus.

4- the empty addrvec case:

- encode with features 0 and decoding as addr should give back 
entity_addr_t().

I think that's it?

sage


> 
> On Mon, Jun 6, 2016 at 4:28 PM, Junwang Zhao <zhjwpku@gmail.com> wrote:
> > Hi, Sage,
> >
> > Sorry I did not see your reply util now, I will work on it.
> >
> > Thanks!
> > Zhao
> >
> > On Fri, Jun 3, 2016 at 1:46 AM, Sage Weil <sage@newdream.net> wrote:
> >> On Thu, 2 Jun 2016, Sage Weil wrote:
> >>> Hi Zhao,
> >>>
> >>> On Tue, 31 May 2016, Junwang Zhao wrote:
> >>> > Hi Sage & Haomai,
> >>> >
> >>> > I try to run './vstart.sh -d -x -n' to test the patches today, I got
> >>> > some error like 'wrong node', after some work, I found the
> >>> > error was introduced by this patch[0]. I change the code to
> >>> > use the legacy decode and encode, but got stuck somewhere.
> >>> >
> >>> > --- a/src/msg/msg_types.h
> >>> > +++ b/src/msg/msg_types.h
> >>> > @@ -371,7 +371,7 @@ struct entity_addr_t {
> >>> >    // broader study
> >>> >
> >>> >    void encode(bufferlist& bl, uint64_t features) const {
> >>> > -    if ((features & CEPH_FEATURE_MSG_ADDR2) == 0) {
> >>> > +    if ((features & CEPH_FEATURE_MSG_ADDR2) != 0) {
> >>> >        ::encode((__u32)0, bl);
> >>> >        ::encode(nonce, bl);
> >>> >        sockaddr_storage ss = get_sockaddr_storage();
> >>> >
> >>> >
> >>> > Can you please see the patch again and give some comments?
> >>> >
> >>> > [0]https://github.com/zhjwpku/ceph/commit/8934f038fdf28156038770e601e4a27f7a684390
> >>>
> >>> This was enough to make things work fo rme:
> >>>
> >>> diff --git a/src/msg/msg_types.h b/src/msg/msg_types.h
> >>> index 9c521e6..0da0614 100644
> >>> --- a/src/msg/msg_types.h
> >>> +++ b/src/msg/msg_types.h
> >>> @@ -199,8 +199,7 @@ WRITE_CLASS_ENCODER(ceph_sockaddr_storage)
> >>>
> >>>  struct entity_addr_t {
> >>>    typedef enum {
> >>> -    TYPE_NONE = 0,
> >>> -    TYPE_LEGACY = 1,
> >>> +    TYPE_LEGACY = 0,
> >>>    } type_t;
> >>>
> >>>    __u32 type;
> >>> @@ -211,7 +210,7 @@ struct entity_addr_t {
> >>>      sockaddr_in6 sin6;
> >>>    } u;
> >>>
> >>> -  entity_addr_t() : type(0), nonce(0) {
> >>> +  entity_addr_t() : type(TYPE_LEGACY), nonce(0) {
> >>>      memset(&u, 0, sizeof(u));
> >>>    }
> >>>    explicit entity_addr_t(const ceph_entity_addr &o) {
> >>>
> >>> but I think this is not the right fix.  I think we still want
> >>> TYPE_NONE, TYPE_LEGACY, and (soon) TYPE_MSGR2, and that the core problem
> >>> is places that are creating addrs aren't setting the type properly.  Going
> >>> to play with this a bit...
> >>
> >> I pushed a wip-addr-more branch:
> >>
> >>         https://github.com/liewegas/ceph/commits/wip-addr-more
> >>
> >> This fixes the parser to take type prefixes, default to legacy, and
> >> changes entity_addr_t() so that it renders are "-" instead of ":/0" (which
> >> was weird anyway).
> >>
> >> I think what we need next are some unit tests in test/test_addrs.cc that
> >> do things like encode a entity_addr_t with no features, decode as
> >> entity_addrvec_t, or construct varous addrvecs, encode with no features,
> >> an ensure a decode to entity_addr_t gives back the right addr from the
> >> list.
> >>
> >> sage
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John Hunter June 10, 2016, 2:37 a.m. UTC | #5
Hi Sage,

I have go a little bit further, the cases you mentioned really make sense,
please see [0] and make some comments, I am still not good at writting
the unittest.

[0]https://github.com/zhjwpku/ceph/commit/4da16085a650f1dd0b743560836fdf213507d3b4

Thanks!
Zhao

On Fri, Jun 10, 2016 at 12:10 AM, Sage Weil <sage@newdream.net> wrote:
> On Wed, 8 Jun 2016, Junwang Zhao wrote:
>> Hi Sage,
>>
>> Write some unittest, but seems not right, see [0], give some comments
>> please, I use `cout` to print the value, but I don't know where to see them.
>>
>> I got a lot of other "FAIL" in the wip-addr-more branch, like:
>>   ceph-detect-init/run-tox.sh
>>   ceph-disk/run-tox.sh
>>   test/run-rbd-unit-tests.sh
>> I am wondering is this because my working environment? My laptop
>> is really not very powerful, "i5-3210M CPU, 8G memory" :(
>> make check is quite slow.
>>
>> [0] https://github.com/zhjwpku/ceph/commit/e15a240d56c2b67a8b0906dc80650a91bff1518f
>
> I'm traveling today and not able to replicate this yet--I'll take a look
> tomorrow.
>
> In the meantime, though, it doesn't look like your new tests are doing
> quite what I think we need them to.  I think the important cases are:
>
> 1- make sure that any encoded entity_addr_t can be decoded as a
> entity_addrvec_t, with size 1, where item 0 matches the addr.  Can use the
> existing arrays of addrs to parse to test this.
>
> 2- make sure that in all of those cases, if we encode the entity_addrvec_t
> with features 0, and decode as an entity_addr_t, we get back the same
> entity_addr_t.
>
> 3- the addrvec size > 1 cases:
>
> - populate an addrvec with multiple addrs where one is legacy and others
> are not, and ensure that encoding with features 0 and decoding as an addr
> gives back the legacy item.
>
> - multiple legacy addrs with features 0 should encode the first one in the
> list
>
> - a vec with all non-legacy addrs encoded with features 0 and decoded as
> addr should give back an addr that is not entity_addr_t() but is
> unusable/bogus.
>
> 4- the empty addrvec case:
>
> - encode with features 0 and decoding as addr should give back
> entity_addr_t().
>
> I think that's it?
>
> sage
>
>
>>
>> On Mon, Jun 6, 2016 at 4:28 PM, Junwang Zhao <zhjwpku@gmail.com> wrote:
>> > Hi, Sage,
>> >
>> > Sorry I did not see your reply util now, I will work on it.
>> >
>> > Thanks!
>> > Zhao
>> >
>> > On Fri, Jun 3, 2016 at 1:46 AM, Sage Weil <sage@newdream.net> wrote:
>> >> On Thu, 2 Jun 2016, Sage Weil wrote:
>> >>> Hi Zhao,
>> >>>
>> >>> On Tue, 31 May 2016, Junwang Zhao wrote:
>> >>> > Hi Sage & Haomai,
>> >>> >
>> >>> > I try to run './vstart.sh -d -x -n' to test the patches today, I got
>> >>> > some error like 'wrong node', after some work, I found the
>> >>> > error was introduced by this patch[0]. I change the code to
>> >>> > use the legacy decode and encode, but got stuck somewhere.
>> >>> >
>> >>> > --- a/src/msg/msg_types.h
>> >>> > +++ b/src/msg/msg_types.h
>> >>> > @@ -371,7 +371,7 @@ struct entity_addr_t {
>> >>> >    // broader study
>> >>> >
>> >>> >    void encode(bufferlist& bl, uint64_t features) const {
>> >>> > -    if ((features & CEPH_FEATURE_MSG_ADDR2) == 0) {
>> >>> > +    if ((features & CEPH_FEATURE_MSG_ADDR2) != 0) {
>> >>> >        ::encode((__u32)0, bl);
>> >>> >        ::encode(nonce, bl);
>> >>> >        sockaddr_storage ss = get_sockaddr_storage();
>> >>> >
>> >>> >
>> >>> > Can you please see the patch again and give some comments?
>> >>> >
>> >>> > [0]https://github.com/zhjwpku/ceph/commit/8934f038fdf28156038770e601e4a27f7a684390
>> >>>
>> >>> This was enough to make things work fo rme:
>> >>>
>> >>> diff --git a/src/msg/msg_types.h b/src/msg/msg_types.h
>> >>> index 9c521e6..0da0614 100644
>> >>> --- a/src/msg/msg_types.h
>> >>> +++ b/src/msg/msg_types.h
>> >>> @@ -199,8 +199,7 @@ WRITE_CLASS_ENCODER(ceph_sockaddr_storage)
>> >>>
>> >>>  struct entity_addr_t {
>> >>>    typedef enum {
>> >>> -    TYPE_NONE = 0,
>> >>> -    TYPE_LEGACY = 1,
>> >>> +    TYPE_LEGACY = 0,
>> >>>    } type_t;
>> >>>
>> >>>    __u32 type;
>> >>> @@ -211,7 +210,7 @@ struct entity_addr_t {
>> >>>      sockaddr_in6 sin6;
>> >>>    } u;
>> >>>
>> >>> -  entity_addr_t() : type(0), nonce(0) {
>> >>> +  entity_addr_t() : type(TYPE_LEGACY), nonce(0) {
>> >>>      memset(&u, 0, sizeof(u));
>> >>>    }
>> >>>    explicit entity_addr_t(const ceph_entity_addr &o) {
>> >>>
>> >>> but I think this is not the right fix.  I think we still want
>> >>> TYPE_NONE, TYPE_LEGACY, and (soon) TYPE_MSGR2, and that the core problem
>> >>> is places that are creating addrs aren't setting the type properly.  Going
>> >>> to play with this a bit...
>> >>
>> >> I pushed a wip-addr-more branch:
>> >>
>> >>         https://github.com/liewegas/ceph/commits/wip-addr-more
>> >>
>> >> This fixes the parser to take type prefixes, default to legacy, and
>> >> changes entity_addr_t() so that it renders are "-" instead of ":/0" (which
>> >> was weird anyway).
>> >>
>> >> I think what we need next are some unit tests in test/test_addrs.cc that
>> >> do things like encode a entity_addr_t with no features, decode as
>> >> entity_addrvec_t, or construct varous addrvecs, encode with no features,
>> >> an ensure a decode to entity_addr_t gives back the right addr from the
>> >> list.
>> >>
>> >> sage
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John Hunter June 10, 2016, 3:15 p.m. UTC | #6
Hi Sage,

Updated and pushed to another branch, see [0], the 'make check'
shows that the test cases FAILS, could you tell me how to debug
these unittest? I hope to see the result using "cout" but I don't know
where those output goes to :(

[0]https://github.com/zhjwpku/ceph/commit/e7307deae01fd6b675350fdae9c493eae04fdde9

Thanks,
Zhao

On Fri, Jun 10, 2016 at 10:37 AM, Junwang Zhao <zhjwpku@gmail.com> wrote:
> Hi Sage,
>
> I have go a little bit further, the cases you mentioned really make sense,
> please see [0] and make some comments, I am still not good at writting
> the unittest.
>
> [0]https://github.com/zhjwpku/ceph/commit/4da16085a650f1dd0b743560836fdf213507d3b4
>
> Thanks!
> Zhao
>
> On Fri, Jun 10, 2016 at 12:10 AM, Sage Weil <sage@newdream.net> wrote:
>> On Wed, 8 Jun 2016, Junwang Zhao wrote:
>>> Hi Sage,
>>>
>>> Write some unittest, but seems not right, see [0], give some comments
>>> please, I use `cout` to print the value, but I don't know where to see them.
>>>
>>> I got a lot of other "FAIL" in the wip-addr-more branch, like:
>>>   ceph-detect-init/run-tox.sh
>>>   ceph-disk/run-tox.sh
>>>   test/run-rbd-unit-tests.sh
>>> I am wondering is this because my working environment? My laptop
>>> is really not very powerful, "i5-3210M CPU, 8G memory" :(
>>> make check is quite slow.
>>>
>>> [0] https://github.com/zhjwpku/ceph/commit/e15a240d56c2b67a8b0906dc80650a91bff1518f
>>
>> I'm traveling today and not able to replicate this yet--I'll take a look
>> tomorrow.
>>
>> In the meantime, though, it doesn't look like your new tests are doing
>> quite what I think we need them to.  I think the important cases are:
>>
>> 1- make sure that any encoded entity_addr_t can be decoded as a
>> entity_addrvec_t, with size 1, where item 0 matches the addr.  Can use the
>> existing arrays of addrs to parse to test this.
>>
>> 2- make sure that in all of those cases, if we encode the entity_addrvec_t
>> with features 0, and decode as an entity_addr_t, we get back the same
>> entity_addr_t.
>>
>> 3- the addrvec size > 1 cases:
>>
>> - populate an addrvec with multiple addrs where one is legacy and others
>> are not, and ensure that encoding with features 0 and decoding as an addr
>> gives back the legacy item.
>>
>> - multiple legacy addrs with features 0 should encode the first one in the
>> list
>>
>> - a vec with all non-legacy addrs encoded with features 0 and decoded as
>> addr should give back an addr that is not entity_addr_t() but is
>> unusable/bogus.
>>
>> 4- the empty addrvec case:
>>
>> - encode with features 0 and decoding as addr should give back
>> entity_addr_t().
>>
>> I think that's it?
>>
>> sage
>>
>>
>>>
>>> On Mon, Jun 6, 2016 at 4:28 PM, Junwang Zhao <zhjwpku@gmail.com> wrote:
>>> > Hi, Sage,
>>> >
>>> > Sorry I did not see your reply util now, I will work on it.
>>> >
>>> > Thanks!
>>> > Zhao
>>> >
>>> > On Fri, Jun 3, 2016 at 1:46 AM, Sage Weil <sage@newdream.net> wrote:
>>> >> On Thu, 2 Jun 2016, Sage Weil wrote:
>>> >>> Hi Zhao,
>>> >>>
>>> >>> On Tue, 31 May 2016, Junwang Zhao wrote:
>>> >>> > Hi Sage & Haomai,
>>> >>> >
>>> >>> > I try to run './vstart.sh -d -x -n' to test the patches today, I got
>>> >>> > some error like 'wrong node', after some work, I found the
>>> >>> > error was introduced by this patch[0]. I change the code to
>>> >>> > use the legacy decode and encode, but got stuck somewhere.
>>> >>> >
>>> >>> > --- a/src/msg/msg_types.h
>>> >>> > +++ b/src/msg/msg_types.h
>>> >>> > @@ -371,7 +371,7 @@ struct entity_addr_t {
>>> >>> >    // broader study
>>> >>> >
>>> >>> >    void encode(bufferlist& bl, uint64_t features) const {
>>> >>> > -    if ((features & CEPH_FEATURE_MSG_ADDR2) == 0) {
>>> >>> > +    if ((features & CEPH_FEATURE_MSG_ADDR2) != 0) {
>>> >>> >        ::encode((__u32)0, bl);
>>> >>> >        ::encode(nonce, bl);
>>> >>> >        sockaddr_storage ss = get_sockaddr_storage();
>>> >>> >
>>> >>> >
>>> >>> > Can you please see the patch again and give some comments?
>>> >>> >
>>> >>> > [0]https://github.com/zhjwpku/ceph/commit/8934f038fdf28156038770e601e4a27f7a684390
>>> >>>
>>> >>> This was enough to make things work fo rme:
>>> >>>
>>> >>> diff --git a/src/msg/msg_types.h b/src/msg/msg_types.h
>>> >>> index 9c521e6..0da0614 100644
>>> >>> --- a/src/msg/msg_types.h
>>> >>> +++ b/src/msg/msg_types.h
>>> >>> @@ -199,8 +199,7 @@ WRITE_CLASS_ENCODER(ceph_sockaddr_storage)
>>> >>>
>>> >>>  struct entity_addr_t {
>>> >>>    typedef enum {
>>> >>> -    TYPE_NONE = 0,
>>> >>> -    TYPE_LEGACY = 1,
>>> >>> +    TYPE_LEGACY = 0,
>>> >>>    } type_t;
>>> >>>
>>> >>>    __u32 type;
>>> >>> @@ -211,7 +210,7 @@ struct entity_addr_t {
>>> >>>      sockaddr_in6 sin6;
>>> >>>    } u;
>>> >>>
>>> >>> -  entity_addr_t() : type(0), nonce(0) {
>>> >>> +  entity_addr_t() : type(TYPE_LEGACY), nonce(0) {
>>> >>>      memset(&u, 0, sizeof(u));
>>> >>>    }
>>> >>>    explicit entity_addr_t(const ceph_entity_addr &o) {
>>> >>>
>>> >>> but I think this is not the right fix.  I think we still want
>>> >>> TYPE_NONE, TYPE_LEGACY, and (soon) TYPE_MSGR2, and that the core problem
>>> >>> is places that are creating addrs aren't setting the type properly.  Going
>>> >>> to play with this a bit...
>>> >>
>>> >> I pushed a wip-addr-more branch:
>>> >>
>>> >>         https://github.com/liewegas/ceph/commits/wip-addr-more
>>> >>
>>> >> This fixes the parser to take type prefixes, default to legacy, and
>>> >> changes entity_addr_t() so that it renders are "-" instead of ":/0" (which
>>> >> was weird anyway).
>>> >>
>>> >> I think what we need next are some unit tests in test/test_addrs.cc that
>>> >> do things like encode a entity_addr_t with no features, decode as
>>> >> entity_addrvec_t, or construct varous addrvecs, encode with no features,
>>> >> an ensure a decode to entity_addr_t gives back the right addr from the
>>> >> list.
>>> >>
>>> >> sage
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sage Weil June 10, 2016, 5:05 p.m. UTC | #7
On Fri, 10 Jun 2016, Junwang Zhao wrote:
> Hi Sage,
> 
> Updated and pushed to another branch, see [0], the 'make check'
> shows that the test cases FAILS, could you tell me how to debug
> these unittest? I hope to see the result using "cout" but I don't know
> where those output goes to :(
> 
> [0]https://github.com/zhjwpku/ceph/commit/e7307deae01fd6b675350fdae9c493eae04fdde9

The output of any make check test is found at $test_name.log.  In this 
case, unittest_addrs.log.  But you probably just want to

 make unittest_addrs && ./unittest_addrs

sage


> 
> Thanks,
> Zhao
> 
> On Fri, Jun 10, 2016 at 10:37 AM, Junwang Zhao <zhjwpku@gmail.com> wrote:
> > Hi Sage,
> >
> > I have go a little bit further, the cases you mentioned really make sense,
> > please see [0] and make some comments, I am still not good at writting
> > the unittest.
> >
> > [0]https://github.com/zhjwpku/ceph/commit/4da16085a650f1dd0b743560836fdf213507d3b4
> >
> > Thanks!
> > Zhao
> >
> > On Fri, Jun 10, 2016 at 12:10 AM, Sage Weil <sage@newdream.net> wrote:
> >> On Wed, 8 Jun 2016, Junwang Zhao wrote:
> >>> Hi Sage,
> >>>
> >>> Write some unittest, but seems not right, see [0], give some comments
> >>> please, I use `cout` to print the value, but I don't know where to see them.
> >>>
> >>> I got a lot of other "FAIL" in the wip-addr-more branch, like:
> >>>   ceph-detect-init/run-tox.sh
> >>>   ceph-disk/run-tox.sh
> >>>   test/run-rbd-unit-tests.sh
> >>> I am wondering is this because my working environment? My laptop
> >>> is really not very powerful, "i5-3210M CPU, 8G memory" :(
> >>> make check is quite slow.
> >>>
> >>> [0] https://github.com/zhjwpku/ceph/commit/e15a240d56c2b67a8b0906dc80650a91bff1518f
> >>
> >> I'm traveling today and not able to replicate this yet--I'll take a look
> >> tomorrow.
> >>
> >> In the meantime, though, it doesn't look like your new tests are doing
> >> quite what I think we need them to.  I think the important cases are:
> >>
> >> 1- make sure that any encoded entity_addr_t can be decoded as a
> >> entity_addrvec_t, with size 1, where item 0 matches the addr.  Can use the
> >> existing arrays of addrs to parse to test this.
> >>
> >> 2- make sure that in all of those cases, if we encode the entity_addrvec_t
> >> with features 0, and decode as an entity_addr_t, we get back the same
> >> entity_addr_t.
> >>
> >> 3- the addrvec size > 1 cases:
> >>
> >> - populate an addrvec with multiple addrs where one is legacy and others
> >> are not, and ensure that encoding with features 0 and decoding as an addr
> >> gives back the legacy item.
> >>
> >> - multiple legacy addrs with features 0 should encode the first one in the
> >> list
> >>
> >> - a vec with all non-legacy addrs encoded with features 0 and decoded as
> >> addr should give back an addr that is not entity_addr_t() but is
> >> unusable/bogus.
> >>
> >> 4- the empty addrvec case:
> >>
> >> - encode with features 0 and decoding as addr should give back
> >> entity_addr_t().
> >>
> >> I think that's it?
> >>
> >> sage
> >>
> >>
> >>>
> >>> On Mon, Jun 6, 2016 at 4:28 PM, Junwang Zhao <zhjwpku@gmail.com> wrote:
> >>> > Hi, Sage,
> >>> >
> >>> > Sorry I did not see your reply util now, I will work on it.
> >>> >
> >>> > Thanks!
> >>> > Zhao
> >>> >
> >>> > On Fri, Jun 3, 2016 at 1:46 AM, Sage Weil <sage@newdream.net> wrote:
> >>> >> On Thu, 2 Jun 2016, Sage Weil wrote:
> >>> >>> Hi Zhao,
> >>> >>>
> >>> >>> On Tue, 31 May 2016, Junwang Zhao wrote:
> >>> >>> > Hi Sage & Haomai,
> >>> >>> >
> >>> >>> > I try to run './vstart.sh -d -x -n' to test the patches today, I got
> >>> >>> > some error like 'wrong node', after some work, I found the
> >>> >>> > error was introduced by this patch[0]. I change the code to
> >>> >>> > use the legacy decode and encode, but got stuck somewhere.
> >>> >>> >
> >>> >>> > --- a/src/msg/msg_types.h
> >>> >>> > +++ b/src/msg/msg_types.h
> >>> >>> > @@ -371,7 +371,7 @@ struct entity_addr_t {
> >>> >>> >    // broader study
> >>> >>> >
> >>> >>> >    void encode(bufferlist& bl, uint64_t features) const {
> >>> >>> > -    if ((features & CEPH_FEATURE_MSG_ADDR2) == 0) {
> >>> >>> > +    if ((features & CEPH_FEATURE_MSG_ADDR2) != 0) {
> >>> >>> >        ::encode((__u32)0, bl);
> >>> >>> >        ::encode(nonce, bl);
> >>> >>> >        sockaddr_storage ss = get_sockaddr_storage();
> >>> >>> >
> >>> >>> >
> >>> >>> > Can you please see the patch again and give some comments?
> >>> >>> >
> >>> >>> > [0]https://github.com/zhjwpku/ceph/commit/8934f038fdf28156038770e601e4a27f7a684390
> >>> >>>
> >>> >>> This was enough to make things work fo rme:
> >>> >>>
> >>> >>> diff --git a/src/msg/msg_types.h b/src/msg/msg_types.h
> >>> >>> index 9c521e6..0da0614 100644
> >>> >>> --- a/src/msg/msg_types.h
> >>> >>> +++ b/src/msg/msg_types.h
> >>> >>> @@ -199,8 +199,7 @@ WRITE_CLASS_ENCODER(ceph_sockaddr_storage)
> >>> >>>
> >>> >>>  struct entity_addr_t {
> >>> >>>    typedef enum {
> >>> >>> -    TYPE_NONE = 0,
> >>> >>> -    TYPE_LEGACY = 1,
> >>> >>> +    TYPE_LEGACY = 0,
> >>> >>>    } type_t;
> >>> >>>
> >>> >>>    __u32 type;
> >>> >>> @@ -211,7 +210,7 @@ struct entity_addr_t {
> >>> >>>      sockaddr_in6 sin6;
> >>> >>>    } u;
> >>> >>>
> >>> >>> -  entity_addr_t() : type(0), nonce(0) {
> >>> >>> +  entity_addr_t() : type(TYPE_LEGACY), nonce(0) {
> >>> >>>      memset(&u, 0, sizeof(u));
> >>> >>>    }
> >>> >>>    explicit entity_addr_t(const ceph_entity_addr &o) {
> >>> >>>
> >>> >>> but I think this is not the right fix.  I think we still want
> >>> >>> TYPE_NONE, TYPE_LEGACY, and (soon) TYPE_MSGR2, and that the core problem
> >>> >>> is places that are creating addrs aren't setting the type properly.  Going
> >>> >>> to play with this a bit...
> >>> >>
> >>> >> I pushed a wip-addr-more branch:
> >>> >>
> >>> >>         https://github.com/liewegas/ceph/commits/wip-addr-more
> >>> >>
> >>> >> This fixes the parser to take type prefixes, default to legacy, and
> >>> >> changes entity_addr_t() so that it renders are "-" instead of ":/0" (which
> >>> >> was weird anyway).
> >>> >>
> >>> >> I think what we need next are some unit tests in test/test_addrs.cc that
> >>> >> do things like encode a entity_addr_t with no features, decode as
> >>> >> entity_addrvec_t, or construct varous addrvecs, encode with no features,
> >>> >> an ensure a decode to entity_addr_t gives back the right addr from the
> >>> >> list.
> >>> >>
> >>> >> sage
> >>> --
> >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>> the body of a message to majordomo@vger.kernel.org
> >>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>
> >>>
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sage Weil June 10, 2016, 5:07 p.m. UTC | #8
On Fri, 10 Jun 2016, Junwang Zhao wrote:
> Hi Sage,
> 
> Updated and pushed to another branch, see [0], the 'make check'
> shows that the test cases FAILS, could you tell me how to debug
> these unittest? I hope to see the result using "cout" but I don't know
> where those output goes to :(
> 
> [0]https://github.com/zhjwpku/ceph/commit/e7307deae01fd6b675350fdae9c493eae04fdde9

I fixed a problem in the messengesr that was breaking the addr learning, 
and thus a bunch of the unit tests that try to run test clusters.  I also 
rebased it on master.  Pushed an updated wip-addr-more to my git tree 
(liewegas/ceph)!

sage

> 
> Thanks,
> Zhao
> 
> On Fri, Jun 10, 2016 at 10:37 AM, Junwang Zhao <zhjwpku@gmail.com> wrote:
> > Hi Sage,
> >
> > I have go a little bit further, the cases you mentioned really make sense,
> > please see [0] and make some comments, I am still not good at writting
> > the unittest.
> >
> > [0]https://github.com/zhjwpku/ceph/commit/4da16085a650f1dd0b743560836fdf213507d3b4
> >
> > Thanks!
> > Zhao
> >
> > On Fri, Jun 10, 2016 at 12:10 AM, Sage Weil <sage@newdream.net> wrote:
> >> On Wed, 8 Jun 2016, Junwang Zhao wrote:
> >>> Hi Sage,
> >>>
> >>> Write some unittest, but seems not right, see [0], give some comments
> >>> please, I use `cout` to print the value, but I don't know where to see them.
> >>>
> >>> I got a lot of other "FAIL" in the wip-addr-more branch, like:
> >>>   ceph-detect-init/run-tox.sh
> >>>   ceph-disk/run-tox.sh
> >>>   test/run-rbd-unit-tests.sh
> >>> I am wondering is this because my working environment? My laptop
> >>> is really not very powerful, "i5-3210M CPU, 8G memory" :(
> >>> make check is quite slow.
> >>>
> >>> [0] https://github.com/zhjwpku/ceph/commit/e15a240d56c2b67a8b0906dc80650a91bff1518f
> >>
> >> I'm traveling today and not able to replicate this yet--I'll take a look
> >> tomorrow.
> >>
> >> In the meantime, though, it doesn't look like your new tests are doing
> >> quite what I think we need them to.  I think the important cases are:
> >>
> >> 1- make sure that any encoded entity_addr_t can be decoded as a
> >> entity_addrvec_t, with size 1, where item 0 matches the addr.  Can use the
> >> existing arrays of addrs to parse to test this.
> >>
> >> 2- make sure that in all of those cases, if we encode the entity_addrvec_t
> >> with features 0, and decode as an entity_addr_t, we get back the same
> >> entity_addr_t.
> >>
> >> 3- the addrvec size > 1 cases:
> >>
> >> - populate an addrvec with multiple addrs where one is legacy and others
> >> are not, and ensure that encoding with features 0 and decoding as an addr
> >> gives back the legacy item.
> >>
> >> - multiple legacy addrs with features 0 should encode the first one in the
> >> list
> >>
> >> - a vec with all non-legacy addrs encoded with features 0 and decoded as
> >> addr should give back an addr that is not entity_addr_t() but is
> >> unusable/bogus.
> >>
> >> 4- the empty addrvec case:
> >>
> >> - encode with features 0 and decoding as addr should give back
> >> entity_addr_t().
> >>
> >> I think that's it?
> >>
> >> sage
> >>
> >>
> >>>
> >>> On Mon, Jun 6, 2016 at 4:28 PM, Junwang Zhao <zhjwpku@gmail.com> wrote:
> >>> > Hi, Sage,
> >>> >
> >>> > Sorry I did not see your reply util now, I will work on it.
> >>> >
> >>> > Thanks!
> >>> > Zhao
> >>> >
> >>> > On Fri, Jun 3, 2016 at 1:46 AM, Sage Weil <sage@newdream.net> wrote:
> >>> >> On Thu, 2 Jun 2016, Sage Weil wrote:
> >>> >>> Hi Zhao,
> >>> >>>
> >>> >>> On Tue, 31 May 2016, Junwang Zhao wrote:
> >>> >>> > Hi Sage & Haomai,
> >>> >>> >
> >>> >>> > I try to run './vstart.sh -d -x -n' to test the patches today, I got
> >>> >>> > some error like 'wrong node', after some work, I found the
> >>> >>> > error was introduced by this patch[0]. I change the code to
> >>> >>> > use the legacy decode and encode, but got stuck somewhere.
> >>> >>> >
> >>> >>> > --- a/src/msg/msg_types.h
> >>> >>> > +++ b/src/msg/msg_types.h
> >>> >>> > @@ -371,7 +371,7 @@ struct entity_addr_t {
> >>> >>> >    // broader study
> >>> >>> >
> >>> >>> >    void encode(bufferlist& bl, uint64_t features) const {
> >>> >>> > -    if ((features & CEPH_FEATURE_MSG_ADDR2) == 0) {
> >>> >>> > +    if ((features & CEPH_FEATURE_MSG_ADDR2) != 0) {
> >>> >>> >        ::encode((__u32)0, bl);
> >>> >>> >        ::encode(nonce, bl);
> >>> >>> >        sockaddr_storage ss = get_sockaddr_storage();
> >>> >>> >
> >>> >>> >
> >>> >>> > Can you please see the patch again and give some comments?
> >>> >>> >
> >>> >>> > [0]https://github.com/zhjwpku/ceph/commit/8934f038fdf28156038770e601e4a27f7a684390
> >>> >>>
> >>> >>> This was enough to make things work fo rme:
> >>> >>>
> >>> >>> diff --git a/src/msg/msg_types.h b/src/msg/msg_types.h
> >>> >>> index 9c521e6..0da0614 100644
> >>> >>> --- a/src/msg/msg_types.h
> >>> >>> +++ b/src/msg/msg_types.h
> >>> >>> @@ -199,8 +199,7 @@ WRITE_CLASS_ENCODER(ceph_sockaddr_storage)
> >>> >>>
> >>> >>>  struct entity_addr_t {
> >>> >>>    typedef enum {
> >>> >>> -    TYPE_NONE = 0,
> >>> >>> -    TYPE_LEGACY = 1,
> >>> >>> +    TYPE_LEGACY = 0,
> >>> >>>    } type_t;
> >>> >>>
> >>> >>>    __u32 type;
> >>> >>> @@ -211,7 +210,7 @@ struct entity_addr_t {
> >>> >>>      sockaddr_in6 sin6;
> >>> >>>    } u;
> >>> >>>
> >>> >>> -  entity_addr_t() : type(0), nonce(0) {
> >>> >>> +  entity_addr_t() : type(TYPE_LEGACY), nonce(0) {
> >>> >>>      memset(&u, 0, sizeof(u));
> >>> >>>    }
> >>> >>>    explicit entity_addr_t(const ceph_entity_addr &o) {
> >>> >>>
> >>> >>> but I think this is not the right fix.  I think we still want
> >>> >>> TYPE_NONE, TYPE_LEGACY, and (soon) TYPE_MSGR2, and that the core problem
> >>> >>> is places that are creating addrs aren't setting the type properly.  Going
> >>> >>> to play with this a bit...
> >>> >>
> >>> >> I pushed a wip-addr-more branch:
> >>> >>
> >>> >>         https://github.com/liewegas/ceph/commits/wip-addr-more
> >>> >>
> >>> >> This fixes the parser to take type prefixes, default to legacy, and
> >>> >> changes entity_addr_t() so that it renders are "-" instead of ":/0" (which
> >>> >> was weird anyway).
> >>> >>
> >>> >> I think what we need next are some unit tests in test/test_addrs.cc that
> >>> >> do things like encode a entity_addr_t with no features, decode as
> >>> >> entity_addrvec_t, or construct varous addrvecs, encode with no features,
> >>> >> an ensure a decode to entity_addr_t gives back the right addr from the
> >>> >> list.
> >>> >>
> >>> >> sage
> >>> --
> >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>> the body of a message to majordomo@vger.kernel.org
> >>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>
> >>>
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John Hunter June 11, 2016, 9:29 a.m. UTC | #9
Hi Sage,

I have modified the patch to make it PASS the unittest, you may want
to have a look and cherry-pick the updated commit[0].

Thanks,
Zhao

[0] https://github.com/zhjwpku/ceph/commit/b1f1183e3e2f39bbdbfa326d0dfc6f9b95611321

On Sat, Jun 11, 2016 at 1:07 AM, Sage Weil <sage@newdream.net> wrote:
> On Fri, 10 Jun 2016, Junwang Zhao wrote:
>> Hi Sage,
>>
>> Updated and pushed to another branch, see [0], the 'make check'
>> shows that the test cases FAILS, could you tell me how to debug
>> these unittest? I hope to see the result using "cout" but I don't know
>> where those output goes to :(
>>
>> [0]https://github.com/zhjwpku/ceph/commit/e7307deae01fd6b675350fdae9c493eae04fdde9
>
> I fixed a problem in the messengesr that was breaking the addr learning,
> and thus a bunch of the unit tests that try to run test clusters.  I also
> rebased it on master.  Pushed an updated wip-addr-more to my git tree
> (liewegas/ceph)!
>
> sage
>
>>
>> Thanks,
>> Zhao
>>
>> On Fri, Jun 10, 2016 at 10:37 AM, Junwang Zhao <zhjwpku@gmail.com> wrote:
>> > Hi Sage,
>> >
>> > I have go a little bit further, the cases you mentioned really make sense,
>> > please see [0] and make some comments, I am still not good at writting
>> > the unittest.
>> >
>> > [0]https://github.com/zhjwpku/ceph/commit/4da16085a650f1dd0b743560836fdf213507d3b4
>> >
>> > Thanks!
>> > Zhao
>> >
>> > On Fri, Jun 10, 2016 at 12:10 AM, Sage Weil <sage@newdream.net> wrote:
>> >> On Wed, 8 Jun 2016, Junwang Zhao wrote:
>> >>> Hi Sage,
>> >>>
>> >>> Write some unittest, but seems not right, see [0], give some comments
>> >>> please, I use `cout` to print the value, but I don't know where to see them.
>> >>>
>> >>> I got a lot of other "FAIL" in the wip-addr-more branch, like:
>> >>>   ceph-detect-init/run-tox.sh
>> >>>   ceph-disk/run-tox.sh
>> >>>   test/run-rbd-unit-tests.sh
>> >>> I am wondering is this because my working environment? My laptop
>> >>> is really not very powerful, "i5-3210M CPU, 8G memory" :(
>> >>> make check is quite slow.
>> >>>
>> >>> [0] https://github.com/zhjwpku/ceph/commit/e15a240d56c2b67a8b0906dc80650a91bff1518f
>> >>
>> >> I'm traveling today and not able to replicate this yet--I'll take a look
>> >> tomorrow.
>> >>
>> >> In the meantime, though, it doesn't look like your new tests are doing
>> >> quite what I think we need them to.  I think the important cases are:
>> >>
>> >> 1- make sure that any encoded entity_addr_t can be decoded as a
>> >> entity_addrvec_t, with size 1, where item 0 matches the addr.  Can use the
>> >> existing arrays of addrs to parse to test this.
>> >>
>> >> 2- make sure that in all of those cases, if we encode the entity_addrvec_t
>> >> with features 0, and decode as an entity_addr_t, we get back the same
>> >> entity_addr_t.
>> >>
>> >> 3- the addrvec size > 1 cases:
>> >>
>> >> - populate an addrvec with multiple addrs where one is legacy and others
>> >> are not, and ensure that encoding with features 0 and decoding as an addr
>> >> gives back the legacy item.
>> >>
>> >> - multiple legacy addrs with features 0 should encode the first one in the
>> >> list
>> >>
>> >> - a vec with all non-legacy addrs encoded with features 0 and decoded as
>> >> addr should give back an addr that is not entity_addr_t() but is
>> >> unusable/bogus.
>> >>
>> >> 4- the empty addrvec case:
>> >>
>> >> - encode with features 0 and decoding as addr should give back
>> >> entity_addr_t().
>> >>
>> >> I think that's it?
>> >>
>> >> sage
>> >>
>> >>
>> >>>
>> >>> On Mon, Jun 6, 2016 at 4:28 PM, Junwang Zhao <zhjwpku@gmail.com> wrote:
>> >>> > Hi, Sage,
>> >>> >
>> >>> > Sorry I did not see your reply util now, I will work on it.
>> >>> >
>> >>> > Thanks!
>> >>> > Zhao
>> >>> >
>> >>> > On Fri, Jun 3, 2016 at 1:46 AM, Sage Weil <sage@newdream.net> wrote:
>> >>> >> On Thu, 2 Jun 2016, Sage Weil wrote:
>> >>> >>> Hi Zhao,
>> >>> >>>
>> >>> >>> On Tue, 31 May 2016, Junwang Zhao wrote:
>> >>> >>> > Hi Sage & Haomai,
>> >>> >>> >
>> >>> >>> > I try to run './vstart.sh -d -x -n' to test the patches today, I got
>> >>> >>> > some error like 'wrong node', after some work, I found the
>> >>> >>> > error was introduced by this patch[0]. I change the code to
>> >>> >>> > use the legacy decode and encode, but got stuck somewhere.
>> >>> >>> >
>> >>> >>> > --- a/src/msg/msg_types.h
>> >>> >>> > +++ b/src/msg/msg_types.h
>> >>> >>> > @@ -371,7 +371,7 @@ struct entity_addr_t {
>> >>> >>> >    // broader study
>> >>> >>> >
>> >>> >>> >    void encode(bufferlist& bl, uint64_t features) const {
>> >>> >>> > -    if ((features & CEPH_FEATURE_MSG_ADDR2) == 0) {
>> >>> >>> > +    if ((features & CEPH_FEATURE_MSG_ADDR2) != 0) {
>> >>> >>> >        ::encode((__u32)0, bl);
>> >>> >>> >        ::encode(nonce, bl);
>> >>> >>> >        sockaddr_storage ss = get_sockaddr_storage();
>> >>> >>> >
>> >>> >>> >
>> >>> >>> > Can you please see the patch again and give some comments?
>> >>> >>> >
>> >>> >>> > [0]https://github.com/zhjwpku/ceph/commit/8934f038fdf28156038770e601e4a27f7a684390
>> >>> >>>
>> >>> >>> This was enough to make things work fo rme:
>> >>> >>>
>> >>> >>> diff --git a/src/msg/msg_types.h b/src/msg/msg_types.h
>> >>> >>> index 9c521e6..0da0614 100644
>> >>> >>> --- a/src/msg/msg_types.h
>> >>> >>> +++ b/src/msg/msg_types.h
>> >>> >>> @@ -199,8 +199,7 @@ WRITE_CLASS_ENCODER(ceph_sockaddr_storage)
>> >>> >>>
>> >>> >>>  struct entity_addr_t {
>> >>> >>>    typedef enum {
>> >>> >>> -    TYPE_NONE = 0,
>> >>> >>> -    TYPE_LEGACY = 1,
>> >>> >>> +    TYPE_LEGACY = 0,
>> >>> >>>    } type_t;
>> >>> >>>
>> >>> >>>    __u32 type;
>> >>> >>> @@ -211,7 +210,7 @@ struct entity_addr_t {
>> >>> >>>      sockaddr_in6 sin6;
>> >>> >>>    } u;
>> >>> >>>
>> >>> >>> -  entity_addr_t() : type(0), nonce(0) {
>> >>> >>> +  entity_addr_t() : type(TYPE_LEGACY), nonce(0) {
>> >>> >>>      memset(&u, 0, sizeof(u));
>> >>> >>>    }
>> >>> >>>    explicit entity_addr_t(const ceph_entity_addr &o) {
>> >>> >>>
>> >>> >>> but I think this is not the right fix.  I think we still want
>> >>> >>> TYPE_NONE, TYPE_LEGACY, and (soon) TYPE_MSGR2, and that the core problem
>> >>> >>> is places that are creating addrs aren't setting the type properly.  Going
>> >>> >>> to play with this a bit...
>> >>> >>
>> >>> >> I pushed a wip-addr-more branch:
>> >>> >>
>> >>> >>         https://github.com/liewegas/ceph/commits/wip-addr-more
>> >>> >>
>> >>> >> This fixes the parser to take type prefixes, default to legacy, and
>> >>> >> changes entity_addr_t() so that it renders are "-" instead of ":/0" (which
>> >>> >> was weird anyway).
>> >>> >>
>> >>> >> I think what we need next are some unit tests in test/test_addrs.cc that
>> >>> >> do things like encode a entity_addr_t with no features, decode as
>> >>> >> entity_addrvec_t, or construct varous addrvecs, encode with no features,
>> >>> >> an ensure a decode to entity_addr_t gives back the right addr from the
>> >>> >> list.
>> >>> >>
>> >>> >> sage
>> >>> --
>> >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> >>> the body of a message to majordomo@vger.kernel.org
>> >>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> >>>
>> >>>
>>
>>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sage Weil June 15, 2016, 9:17 p.m. UTC | #10
I've rebased this and pushed a new wip-addrvec branch, PR

	https://github.com/ceph/ceph/pull/9726

I wonder if a good way to ease into using this would be to make 
AsyncMessenger allow binding to multiple addresses (IPv4 and IPv6).  

Haomai, what do you think?

That will help us fix the multiple-address issues in msg/async (for 
example, announcing ourselves using the correct addr depending on which 
socket we accepted the connection on).  That will be one less thing in the 
way to making msg/async speak msgr2 on some connections.

I started working on another branch that cleans up and refactors some of 
the authentication callbacks from the msgr into osd/mds/mon code so that 
we can accomodate both msgr1 and msgr2 authentication.

	https://github.com/ceph/ceph/pull/9727

A bit more cleanup still to do there, but soon I think we'll be to the 
point where we need the msg/async half that implements the protocol so we 
can wire the two together.

sage


On Sat, 11 Jun 2016, Junwang Zhao wrote:

> Hi Sage,
> 
> I have modified the patch to make it PASS the unittest, you may want
> to have a look and cherry-pick the updated commit[0].
> 
> Thanks,
> Zhao
> 
> [0] https://github.com/zhjwpku/ceph/commit/b1f1183e3e2f39bbdbfa326d0dfc6f9b95611321
> 
> On Sat, Jun 11, 2016 at 1:07 AM, Sage Weil <sage@newdream.net> wrote:
> > On Fri, 10 Jun 2016, Junwang Zhao wrote:
> >> Hi Sage,
> >>
> >> Updated and pushed to another branch, see [0], the 'make check'
> >> shows that the test cases FAILS, could you tell me how to debug
> >> these unittest? I hope to see the result using "cout" but I don't know
> >> where those output goes to :(
> >>
> >> [0]https://github.com/zhjwpku/ceph/commit/e7307deae01fd6b675350fdae9c493eae04fdde9
> >
> > I fixed a problem in the messengesr that was breaking the addr learning,
> > and thus a bunch of the unit tests that try to run test clusters.  I also
> > rebased it on master.  Pushed an updated wip-addr-more to my git tree
> > (liewegas/ceph)!
> >
> > sage
> >
> >>
> >> Thanks,
> >> Zhao
> >>
> >> On Fri, Jun 10, 2016 at 10:37 AM, Junwang Zhao <zhjwpku@gmail.com> wrote:
> >> > Hi Sage,
> >> >
> >> > I have go a little bit further, the cases you mentioned really make sense,
> >> > please see [0] and make some comments, I am still not good at writting
> >> > the unittest.
> >> >
> >> > [0]https://github.com/zhjwpku/ceph/commit/4da16085a650f1dd0b743560836fdf213507d3b4
> >> >
> >> > Thanks!
> >> > Zhao
> >> >
> >> > On Fri, Jun 10, 2016 at 12:10 AM, Sage Weil <sage@newdream.net> wrote:
> >> >> On Wed, 8 Jun 2016, Junwang Zhao wrote:
> >> >>> Hi Sage,
> >> >>>
> >> >>> Write some unittest, but seems not right, see [0], give some comments
> >> >>> please, I use `cout` to print the value, but I don't know where to see them.
> >> >>>
> >> >>> I got a lot of other "FAIL" in the wip-addr-more branch, like:
> >> >>>   ceph-detect-init/run-tox.sh
> >> >>>   ceph-disk/run-tox.sh
> >> >>>   test/run-rbd-unit-tests.sh
> >> >>> I am wondering is this because my working environment? My laptop
> >> >>> is really not very powerful, "i5-3210M CPU, 8G memory" :(
> >> >>> make check is quite slow.
> >> >>>
> >> >>> [0] https://github.com/zhjwpku/ceph/commit/e15a240d56c2b67a8b0906dc80650a91bff1518f
> >> >>
> >> >> I'm traveling today and not able to replicate this yet--I'll take a look
> >> >> tomorrow.
> >> >>
> >> >> In the meantime, though, it doesn't look like your new tests are doing
> >> >> quite what I think we need them to.  I think the important cases are:
> >> >>
> >> >> 1- make sure that any encoded entity_addr_t can be decoded as a
> >> >> entity_addrvec_t, with size 1, where item 0 matches the addr.  Can use the
> >> >> existing arrays of addrs to parse to test this.
> >> >>
> >> >> 2- make sure that in all of those cases, if we encode the entity_addrvec_t
> >> >> with features 0, and decode as an entity_addr_t, we get back the same
> >> >> entity_addr_t.
> >> >>
> >> >> 3- the addrvec size > 1 cases:
> >> >>
> >> >> - populate an addrvec with multiple addrs where one is legacy and others
> >> >> are not, and ensure that encoding with features 0 and decoding as an addr
> >> >> gives back the legacy item.
> >> >>
> >> >> - multiple legacy addrs with features 0 should encode the first one in the
> >> >> list
> >> >>
> >> >> - a vec with all non-legacy addrs encoded with features 0 and decoded as
> >> >> addr should give back an addr that is not entity_addr_t() but is
> >> >> unusable/bogus.
> >> >>
> >> >> 4- the empty addrvec case:
> >> >>
> >> >> - encode with features 0 and decoding as addr should give back
> >> >> entity_addr_t().
> >> >>
> >> >> I think that's it?
> >> >>
> >> >> sage
> >> >>
> >> >>
> >> >>>
> >> >>> On Mon, Jun 6, 2016 at 4:28 PM, Junwang Zhao <zhjwpku@gmail.com> wrote:
> >> >>> > Hi, Sage,
> >> >>> >
> >> >>> > Sorry I did not see your reply util now, I will work on it.
> >> >>> >
> >> >>> > Thanks!
> >> >>> > Zhao
> >> >>> >
> >> >>> > On Fri, Jun 3, 2016 at 1:46 AM, Sage Weil <sage@newdream.net> wrote:
> >> >>> >> On Thu, 2 Jun 2016, Sage Weil wrote:
> >> >>> >>> Hi Zhao,
> >> >>> >>>
> >> >>> >>> On Tue, 31 May 2016, Junwang Zhao wrote:
> >> >>> >>> > Hi Sage & Haomai,
> >> >>> >>> >
> >> >>> >>> > I try to run './vstart.sh -d -x -n' to test the patches today, I got
> >> >>> >>> > some error like 'wrong node', after some work, I found the
> >> >>> >>> > error was introduced by this patch[0]. I change the code to
> >> >>> >>> > use the legacy decode and encode, but got stuck somewhere.
> >> >>> >>> >
> >> >>> >>> > --- a/src/msg/msg_types.h
> >> >>> >>> > +++ b/src/msg/msg_types.h
> >> >>> >>> > @@ -371,7 +371,7 @@ struct entity_addr_t {
> >> >>> >>> >    // broader study
> >> >>> >>> >
> >> >>> >>> >    void encode(bufferlist& bl, uint64_t features) const {
> >> >>> >>> > -    if ((features & CEPH_FEATURE_MSG_ADDR2) == 0) {
> >> >>> >>> > +    if ((features & CEPH_FEATURE_MSG_ADDR2) != 0) {
> >> >>> >>> >        ::encode((__u32)0, bl);
> >> >>> >>> >        ::encode(nonce, bl);
> >> >>> >>> >        sockaddr_storage ss = get_sockaddr_storage();
> >> >>> >>> >
> >> >>> >>> >
> >> >>> >>> > Can you please see the patch again and give some comments?
> >> >>> >>> >
> >> >>> >>> > [0]https://github.com/zhjwpku/ceph/commit/8934f038fdf28156038770e601e4a27f7a684390
> >> >>> >>>
> >> >>> >>> This was enough to make things work fo rme:
> >> >>> >>>
> >> >>> >>> diff --git a/src/msg/msg_types.h b/src/msg/msg_types.h
> >> >>> >>> index 9c521e6..0da0614 100644
> >> >>> >>> --- a/src/msg/msg_types.h
> >> >>> >>> +++ b/src/msg/msg_types.h
> >> >>> >>> @@ -199,8 +199,7 @@ WRITE_CLASS_ENCODER(ceph_sockaddr_storage)
> >> >>> >>>
> >> >>> >>>  struct entity_addr_t {
> >> >>> >>>    typedef enum {
> >> >>> >>> -    TYPE_NONE = 0,
> >> >>> >>> -    TYPE_LEGACY = 1,
> >> >>> >>> +    TYPE_LEGACY = 0,
> >> >>> >>>    } type_t;
> >> >>> >>>
> >> >>> >>>    __u32 type;
> >> >>> >>> @@ -211,7 +210,7 @@ struct entity_addr_t {
> >> >>> >>>      sockaddr_in6 sin6;
> >> >>> >>>    } u;
> >> >>> >>>
> >> >>> >>> -  entity_addr_t() : type(0), nonce(0) {
> >> >>> >>> +  entity_addr_t() : type(TYPE_LEGACY), nonce(0) {
> >> >>> >>>      memset(&u, 0, sizeof(u));
> >> >>> >>>    }
> >> >>> >>>    explicit entity_addr_t(const ceph_entity_addr &o) {
> >> >>> >>>
> >> >>> >>> but I think this is not the right fix.  I think we still want
> >> >>> >>> TYPE_NONE, TYPE_LEGACY, and (soon) TYPE_MSGR2, and that the core problem
> >> >>> >>> is places that are creating addrs aren't setting the type properly.  Going
> >> >>> >>> to play with this a bit...
> >> >>> >>
> >> >>> >> I pushed a wip-addr-more branch:
> >> >>> >>
> >> >>> >>         https://github.com/liewegas/ceph/commits/wip-addr-more
> >> >>> >>
> >> >>> >> This fixes the parser to take type prefixes, default to legacy, and
> >> >>> >> changes entity_addr_t() so that it renders are "-" instead of ":/0" (which
> >> >>> >> was weird anyway).
> >> >>> >>
> >> >>> >> I think what we need next are some unit tests in test/test_addrs.cc that
> >> >>> >> do things like encode a entity_addr_t with no features, decode as
> >> >>> >> entity_addrvec_t, or construct varous addrvecs, encode with no features,
> >> >>> >> an ensure a decode to entity_addr_t gives back the right addr from the
> >> >>> >> list.
> >> >>> >>
> >> >>> >> sage
> >> >>> --
> >> >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >> >>> the body of a message to majordomo@vger.kernel.org
> >> >>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> >>>
> >> >>>
> >>
> >>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Haomai Wang June 16, 2016, 1:04 a.m. UTC | #11
On Thu, Jun 16, 2016 at 5:17 AM, Sage Weil <sage@newdream.net> wrote:
>
> I've rebased this and pushed a new wip-addrvec branch, PR
>
>         https://github.com/ceph/ceph/pull/9726
>
> I wonder if a good way to ease into using this would be to make
> AsyncMessenger allow binding to multiple addresses (IPv4 and IPv6).
>
> Haomai, what do you think?
>
> That will help us fix the multiple-address issues in msg/async (for
> example, announcing ourselves using the correct addr depending on which
> socket we accepted the connection on).  That will be one less thing in the
> way to making msg/async speak msgr2 on some connections.


Hmm, you mean in multi osd instance example, we need to make each osd
instance own a bind address?

What's the problem if they use the same address, I guess the reason
should be we have existing codes rely on this independent address
potentially.

>
>
> I started working on another branch that cleans up and refactors some of
> the authentication callbacks from the msgr into osd/mds/mon code so that
> we can accomodate both msgr1 and msgr2 authentication.
>
>         https://github.com/ceph/ceph/pull/9727
>
> A bit more cleanup still to do there, but soon I think we'll be to the
> point where we need the msg/async half that implements the protocol so we
> can wire the two together.


ok...

>
>
> sage
>
>
> On Sat, 11 Jun 2016, Junwang Zhao wrote:
>
> > Hi Sage,
> >
> > I have modified the patch to make it PASS the unittest, you may want
> > to have a look and cherry-pick the updated commit[0].
> >
> > Thanks,
> > Zhao
> >
> > [0] https://github.com/zhjwpku/ceph/commit/b1f1183e3e2f39bbdbfa326d0dfc6f9b95611321
> >
> > On Sat, Jun 11, 2016 at 1:07 AM, Sage Weil <sage@newdream.net> wrote:
> > > On Fri, 10 Jun 2016, Junwang Zhao wrote:
> > >> Hi Sage,
> > >>
> > >> Updated and pushed to another branch, see [0], the 'make check'
> > >> shows that the test cases FAILS, could you tell me how to debug
> > >> these unittest? I hope to see the result using "cout" but I don't know
> > >> where those output goes to :(
> > >>
> > >> [0]https://github.com/zhjwpku/ceph/commit/e7307deae01fd6b675350fdae9c493eae04fdde9
> > >
> > > I fixed a problem in the messengesr that was breaking the addr learning,
> > > and thus a bunch of the unit tests that try to run test clusters.  I also
> > > rebased it on master.  Pushed an updated wip-addr-more to my git tree
> > > (liewegas/ceph)!
> > >
> > > sage
> > >
> > >>
> > >> Thanks,
> > >> Zhao
> > >>
> > >> On Fri, Jun 10, 2016 at 10:37 AM, Junwang Zhao <zhjwpku@gmail.com> wrote:
> > >> > Hi Sage,
> > >> >
> > >> > I have go a little bit further, the cases you mentioned really make sense,
> > >> > please see [0] and make some comments, I am still not good at writting
> > >> > the unittest.
> > >> >
> > >> > [0]https://github.com/zhjwpku/ceph/commit/4da16085a650f1dd0b743560836fdf213507d3b4
> > >> >
> > >> > Thanks!
> > >> > Zhao
> > >> >
> > >> > On Fri, Jun 10, 2016 at 12:10 AM, Sage Weil <sage@newdream.net> wrote:
> > >> >> On Wed, 8 Jun 2016, Junwang Zhao wrote:
> > >> >>> Hi Sage,
> > >> >>>
> > >> >>> Write some unittest, but seems not right, see [0], give some comments
> > >> >>> please, I use `cout` to print the value, but I don't know where to see them.
> > >> >>>
> > >> >>> I got a lot of other "FAIL" in the wip-addr-more branch, like:
> > >> >>>   ceph-detect-init/run-tox.sh
> > >> >>>   ceph-disk/run-tox.sh
> > >> >>>   test/run-rbd-unit-tests.sh
> > >> >>> I am wondering is this because my working environment? My laptop
> > >> >>> is really not very powerful, "i5-3210M CPU, 8G memory" :(
> > >> >>> make check is quite slow.
> > >> >>>
> > >> >>> [0] https://github.com/zhjwpku/ceph/commit/e15a240d56c2b67a8b0906dc80650a91bff1518f
> > >> >>
> > >> >> I'm traveling today and not able to replicate this yet--I'll take a look
> > >> >> tomorrow.
> > >> >>
> > >> >> In the meantime, though, it doesn't look like your new tests are doing
> > >> >> quite what I think we need them to.  I think the important cases are:
> > >> >>
> > >> >> 1- make sure that any encoded entity_addr_t can be decoded as a
> > >> >> entity_addrvec_t, with size 1, where item 0 matches the addr.  Can use the
> > >> >> existing arrays of addrs to parse to test this.
> > >> >>
> > >> >> 2- make sure that in all of those cases, if we encode the entity_addrvec_t
> > >> >> with features 0, and decode as an entity_addr_t, we get back the same
> > >> >> entity_addr_t.
> > >> >>
> > >> >> 3- the addrvec size > 1 cases:
> > >> >>
> > >> >> - populate an addrvec with multiple addrs where one is legacy and others
> > >> >> are not, and ensure that encoding with features 0 and decoding as an addr
> > >> >> gives back the legacy item.
> > >> >>
> > >> >> - multiple legacy addrs with features 0 should encode the first one in the
> > >> >> list
> > >> >>
> > >> >> - a vec with all non-legacy addrs encoded with features 0 and decoded as
> > >> >> addr should give back an addr that is not entity_addr_t() but is
> > >> >> unusable/bogus.
> > >> >>
> > >> >> 4- the empty addrvec case:
> > >> >>
> > >> >> - encode with features 0 and decoding as addr should give back
> > >> >> entity_addr_t().
> > >> >>
> > >> >> I think that's it?
> > >> >>
> > >> >> sage
> > >> >>
> > >> >>
> > >> >>>
> > >> >>> On Mon, Jun 6, 2016 at 4:28 PM, Junwang Zhao <zhjwpku@gmail.com> wrote:
> > >> >>> > Hi, Sage,
> > >> >>> >
> > >> >>> > Sorry I did not see your reply util now, I will work on it.
> > >> >>> >
> > >> >>> > Thanks!
> > >> >>> > Zhao
> > >> >>> >
> > >> >>> > On Fri, Jun 3, 2016 at 1:46 AM, Sage Weil <sage@newdream.net> wrote:
> > >> >>> >> On Thu, 2 Jun 2016, Sage Weil wrote:
> > >> >>> >>> Hi Zhao,
> > >> >>> >>>
> > >> >>> >>> On Tue, 31 May 2016, Junwang Zhao wrote:
> > >> >>> >>> > Hi Sage & Haomai,
> > >> >>> >>> >
> > >> >>> >>> > I try to run './vstart.sh -d -x -n' to test the patches today, I got
> > >> >>> >>> > some error like 'wrong node', after some work, I found the
> > >> >>> >>> > error was introduced by this patch[0]. I change the code to
> > >> >>> >>> > use the legacy decode and encode, but got stuck somewhere.
> > >> >>> >>> >
> > >> >>> >>> > --- a/src/msg/msg_types.h
> > >> >>> >>> > +++ b/src/msg/msg_types.h
> > >> >>> >>> > @@ -371,7 +371,7 @@ struct entity_addr_t {
> > >> >>> >>> >    // broader study
> > >> >>> >>> >
> > >> >>> >>> >    void encode(bufferlist& bl, uint64_t features) const {
> > >> >>> >>> > -    if ((features & CEPH_FEATURE_MSG_ADDR2) == 0) {
> > >> >>> >>> > +    if ((features & CEPH_FEATURE_MSG_ADDR2) != 0) {
> > >> >>> >>> >        ::encode((__u32)0, bl);
> > >> >>> >>> >        ::encode(nonce, bl);
> > >> >>> >>> >        sockaddr_storage ss = get_sockaddr_storage();
> > >> >>> >>> >
> > >> >>> >>> >
> > >> >>> >>> > Can you please see the patch again and give some comments?
> > >> >>> >>> >
> > >> >>> >>> > [0]https://github.com/zhjwpku/ceph/commit/8934f038fdf28156038770e601e4a27f7a684390
> > >> >>> >>>
> > >> >>> >>> This was enough to make things work fo rme:
> > >> >>> >>>
> > >> >>> >>> diff --git a/src/msg/msg_types.h b/src/msg/msg_types.h
> > >> >>> >>> index 9c521e6..0da0614 100644
> > >> >>> >>> --- a/src/msg/msg_types.h
> > >> >>> >>> +++ b/src/msg/msg_types.h
> > >> >>> >>> @@ -199,8 +199,7 @@ WRITE_CLASS_ENCODER(ceph_sockaddr_storage)
> > >> >>> >>>
> > >> >>> >>>  struct entity_addr_t {
> > >> >>> >>>    typedef enum {
> > >> >>> >>> -    TYPE_NONE = 0,
> > >> >>> >>> -    TYPE_LEGACY = 1,
> > >> >>> >>> +    TYPE_LEGACY = 0,
> > >> >>> >>>    } type_t;
> > >> >>> >>>
> > >> >>> >>>    __u32 type;
> > >> >>> >>> @@ -211,7 +210,7 @@ struct entity_addr_t {
> > >> >>> >>>      sockaddr_in6 sin6;
> > >> >>> >>>    } u;
> > >> >>> >>>
> > >> >>> >>> -  entity_addr_t() : type(0), nonce(0) {
> > >> >>> >>> +  entity_addr_t() : type(TYPE_LEGACY), nonce(0) {
> > >> >>> >>>      memset(&u, 0, sizeof(u));
> > >> >>> >>>    }
> > >> >>> >>>    explicit entity_addr_t(const ceph_entity_addr &o) {
> > >> >>> >>>
> > >> >>> >>> but I think this is not the right fix.  I think we still want
> > >> >>> >>> TYPE_NONE, TYPE_LEGACY, and (soon) TYPE_MSGR2, and that the core problem
> > >> >>> >>> is places that are creating addrs aren't setting the type properly.  Going
> > >> >>> >>> to play with this a bit...
> > >> >>> >>
> > >> >>> >> I pushed a wip-addr-more branch:
> > >> >>> >>
> > >> >>> >>         https://github.com/liewegas/ceph/commits/wip-addr-more
> > >> >>> >>
> > >> >>> >> This fixes the parser to take type prefixes, default to legacy, and
> > >> >>> >> changes entity_addr_t() so that it renders are "-" instead of ":/0" (which
> > >> >>> >> was weird anyway).
> > >> >>> >>
> > >> >>> >> I think what we need next are some unit tests in test/test_addrs.cc that
> > >> >>> >> do things like encode a entity_addr_t with no features, decode as
> > >> >>> >> entity_addrvec_t, or construct varous addrvecs, encode with no features,
> > >> >>> >> an ensure a decode to entity_addr_t gives back the right addr from the
> > >> >>> >> list.
> > >> >>> >>
> > >> >>> >> sage
> > >> >>> --
> > >> >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> > >> >>> the body of a message to majordomo@vger.kernel.org
> > >> >>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > >> >>>
> > >> >>>
> > >>
> > >>
> > --
> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> >
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sage Weil June 20, 2016, 5:47 p.m. UTC | #12
On Thu, 16 Jun 2016, Haomai Wang wrote:
> On Thu, Jun 16, 2016 at 5:17 AM, Sage Weil <sage@newdream.net> wrote:
> >
> > I've rebased this and pushed a new wip-addrvec branch, PR
> >
> >         https://github.com/ceph/ceph/pull/9726
> >
> > I wonder if a good way to ease into using this would be to make
> > AsyncMessenger allow binding to multiple addresses (IPv4 and IPv6).
> >
> > Haomai, what do you think?
> >
> > That will help us fix the multiple-address issues in msg/async (for
> > example, announcing ourselves using the correct addr depending on which
> > socket we accepted the connection on).  That will be one less thing in the
> > way to making msg/async speak msgr2 on some connections.
> 
> Hmm, you mean in multi osd instance example, we need to make each osd
> instance own a bind address?
> 
> What's the problem if they use the same address, I guess the reason
> should be we have existing codes rely on this independent address
> potentially.

We assume the addrs are unique (hence the nonce) all over the place.

I wasn't thinking about the many-osds-in-one-process case, though.  A 
simpler step is to make a single process, single osd bind to both ipv4 and 
ipv6 addresses, and advertise both in the OSDMap.  (Then later we'll 
extent that to also have the legacy and msgr2 addrs.)

So msg/async will need to bind to multiple addrs, revise the get_myaddr 
interface to return a vector of addresses, and put that in the OSDMap.  
And when accepting a connection, we mark the Connection/Pipe to indicate 
which socket it connected on, and adjust the addr we advertise with 
the initial handshake accordingly.

I suspect there is a bunch of annoying cleanup around all things 
myaddr...

sage


> 
> >
> >
> > I started working on another branch that cleans up and refactors some of
> > the authentication callbacks from the msgr into osd/mds/mon code so that
> > we can accomodate both msgr1 and msgr2 authentication.
> >
> >         https://github.com/ceph/ceph/pull/9727
> >
> > A bit more cleanup still to do there, but soon I think we'll be to the
> > point where we need the msg/async half that implements the protocol so we
> > can wire the two together.
> 
> 
> ok...
> 
> >
> >
> > sage
> >
> >
> > On Sat, 11 Jun 2016, Junwang Zhao wrote:
> >
> > > Hi Sage,
> > >
> > > I have modified the patch to make it PASS the unittest, you may want
> > > to have a look and cherry-pick the updated commit[0].
> > >
> > > Thanks,
> > > Zhao
> > >
> > > [0] https://github.com/zhjwpku/ceph/commit/b1f1183e3e2f39bbdbfa326d0dfc6f9b95611321
> > >
> > > On Sat, Jun 11, 2016 at 1:07 AM, Sage Weil <sage@newdream.net> wrote:
> > > > On Fri, 10 Jun 2016, Junwang Zhao wrote:
> > > >> Hi Sage,
> > > >>
> > > >> Updated and pushed to another branch, see [0], the 'make check'
> > > >> shows that the test cases FAILS, could you tell me how to debug
> > > >> these unittest? I hope to see the result using "cout" but I don't know
> > > >> where those output goes to :(
> > > >>
> > > >> [0]https://github.com/zhjwpku/ceph/commit/e7307deae01fd6b675350fdae9c493eae04fdde9
> > > >
> > > > I fixed a problem in the messengesr that was breaking the addr learning,
> > > > and thus a bunch of the unit tests that try to run test clusters.  I also
> > > > rebased it on master.  Pushed an updated wip-addr-more to my git tree
> > > > (liewegas/ceph)!
> > > >
> > > > sage
> > > >
> > > >>
> > > >> Thanks,
> > > >> Zhao
> > > >>
> > > >> On Fri, Jun 10, 2016 at 10:37 AM, Junwang Zhao <zhjwpku@gmail.com> wrote:
> > > >> > Hi Sage,
> > > >> >
> > > >> > I have go a little bit further, the cases you mentioned really make sense,
> > > >> > please see [0] and make some comments, I am still not good at writting
> > > >> > the unittest.
> > > >> >
> > > >> > [0]https://github.com/zhjwpku/ceph/commit/4da16085a650f1dd0b743560836fdf213507d3b4
> > > >> >
> > > >> > Thanks!
> > > >> > Zhao
> > > >> >
> > > >> > On Fri, Jun 10, 2016 at 12:10 AM, Sage Weil <sage@newdream.net> wrote:
> > > >> >> On Wed, 8 Jun 2016, Junwang Zhao wrote:
> > > >> >>> Hi Sage,
> > > >> >>>
> > > >> >>> Write some unittest, but seems not right, see [0], give some comments
> > > >> >>> please, I use `cout` to print the value, but I don't know where to see them.
> > > >> >>>
> > > >> >>> I got a lot of other "FAIL" in the wip-addr-more branch, like:
> > > >> >>>   ceph-detect-init/run-tox.sh
> > > >> >>>   ceph-disk/run-tox.sh
> > > >> >>>   test/run-rbd-unit-tests.sh
> > > >> >>> I am wondering is this because my working environment? My laptop
> > > >> >>> is really not very powerful, "i5-3210M CPU, 8G memory" :(
> > > >> >>> make check is quite slow.
> > > >> >>>
> > > >> >>> [0] https://github.com/zhjwpku/ceph/commit/e15a240d56c2b67a8b0906dc80650a91bff1518f
> > > >> >>
> > > >> >> I'm traveling today and not able to replicate this yet--I'll take a look
> > > >> >> tomorrow.
> > > >> >>
> > > >> >> In the meantime, though, it doesn't look like your new tests are doing
> > > >> >> quite what I think we need them to.  I think the important cases are:
> > > >> >>
> > > >> >> 1- make sure that any encoded entity_addr_t can be decoded as a
> > > >> >> entity_addrvec_t, with size 1, where item 0 matches the addr.  Can use the
> > > >> >> existing arrays of addrs to parse to test this.
> > > >> >>
> > > >> >> 2- make sure that in all of those cases, if we encode the entity_addrvec_t
> > > >> >> with features 0, and decode as an entity_addr_t, we get back the same
> > > >> >> entity_addr_t.
> > > >> >>
> > > >> >> 3- the addrvec size > 1 cases:
> > > >> >>
> > > >> >> - populate an addrvec with multiple addrs where one is legacy and others
> > > >> >> are not, and ensure that encoding with features 0 and decoding as an addr
> > > >> >> gives back the legacy item.
> > > >> >>
> > > >> >> - multiple legacy addrs with features 0 should encode the first one in the
> > > >> >> list
> > > >> >>
> > > >> >> - a vec with all non-legacy addrs encoded with features 0 and decoded as
> > > >> >> addr should give back an addr that is not entity_addr_t() but is
> > > >> >> unusable/bogus.
> > > >> >>
> > > >> >> 4- the empty addrvec case:
> > > >> >>
> > > >> >> - encode with features 0 and decoding as addr should give back
> > > >> >> entity_addr_t().
> > > >> >>
> > > >> >> I think that's it?
> > > >> >>
> > > >> >> sage
> > > >> >>
> > > >> >>
> > > >> >>>
> > > >> >>> On Mon, Jun 6, 2016 at 4:28 PM, Junwang Zhao <zhjwpku@gmail.com> wrote:
> > > >> >>> > Hi, Sage,
> > > >> >>> >
> > > >> >>> > Sorry I did not see your reply util now, I will work on it.
> > > >> >>> >
> > > >> >>> > Thanks!
> > > >> >>> > Zhao
> > > >> >>> >
> > > >> >>> > On Fri, Jun 3, 2016 at 1:46 AM, Sage Weil <sage@newdream.net> wrote:
> > > >> >>> >> On Thu, 2 Jun 2016, Sage Weil wrote:
> > > >> >>> >>> Hi Zhao,
> > > >> >>> >>>
> > > >> >>> >>> On Tue, 31 May 2016, Junwang Zhao wrote:
> > > >> >>> >>> > Hi Sage & Haomai,
> > > >> >>> >>> >
> > > >> >>> >>> > I try to run './vstart.sh -d -x -n' to test the patches today, I got
> > > >> >>> >>> > some error like 'wrong node', after some work, I found the
> > > >> >>> >>> > error was introduced by this patch[0]. I change the code to
> > > >> >>> >>> > use the legacy decode and encode, but got stuck somewhere.
> > > >> >>> >>> >
> > > >> >>> >>> > --- a/src/msg/msg_types.h
> > > >> >>> >>> > +++ b/src/msg/msg_types.h
> > > >> >>> >>> > @@ -371,7 +371,7 @@ struct entity_addr_t {
> > > >> >>> >>> >    // broader study
> > > >> >>> >>> >
> > > >> >>> >>> >    void encode(bufferlist& bl, uint64_t features) const {
> > > >> >>> >>> > -    if ((features & CEPH_FEATURE_MSG_ADDR2) == 0) {
> > > >> >>> >>> > +    if ((features & CEPH_FEATURE_MSG_ADDR2) != 0) {
> > > >> >>> >>> >        ::encode((__u32)0, bl);
> > > >> >>> >>> >        ::encode(nonce, bl);
> > > >> >>> >>> >        sockaddr_storage ss = get_sockaddr_storage();
> > > >> >>> >>> >
> > > >> >>> >>> >
> > > >> >>> >>> > Can you please see the patch again and give some comments?
> > > >> >>> >>> >
> > > >> >>> >>> > [0]https://github.com/zhjwpku/ceph/commit/8934f038fdf28156038770e601e4a27f7a684390
> > > >> >>> >>>
> > > >> >>> >>> This was enough to make things work fo rme:
> > > >> >>> >>>
> > > >> >>> >>> diff --git a/src/msg/msg_types.h b/src/msg/msg_types.h
> > > >> >>> >>> index 9c521e6..0da0614 100644
> > > >> >>> >>> --- a/src/msg/msg_types.h
> > > >> >>> >>> +++ b/src/msg/msg_types.h
> > > >> >>> >>> @@ -199,8 +199,7 @@ WRITE_CLASS_ENCODER(ceph_sockaddr_storage)
> > > >> >>> >>>
> > > >> >>> >>>  struct entity_addr_t {
> > > >> >>> >>>    typedef enum {
> > > >> >>> >>> -    TYPE_NONE = 0,
> > > >> >>> >>> -    TYPE_LEGACY = 1,
> > > >> >>> >>> +    TYPE_LEGACY = 0,
> > > >> >>> >>>    } type_t;
> > > >> >>> >>>
> > > >> >>> >>>    __u32 type;
> > > >> >>> >>> @@ -211,7 +210,7 @@ struct entity_addr_t {
> > > >> >>> >>>      sockaddr_in6 sin6;
> > > >> >>> >>>    } u;
> > > >> >>> >>>
> > > >> >>> >>> -  entity_addr_t() : type(0), nonce(0) {
> > > >> >>> >>> +  entity_addr_t() : type(TYPE_LEGACY), nonce(0) {
> > > >> >>> >>>      memset(&u, 0, sizeof(u));
> > > >> >>> >>>    }
> > > >> >>> >>>    explicit entity_addr_t(const ceph_entity_addr &o) {
> > > >> >>> >>>
> > > >> >>> >>> but I think this is not the right fix.  I think we still want
> > > >> >>> >>> TYPE_NONE, TYPE_LEGACY, and (soon) TYPE_MSGR2, and that the core problem
> > > >> >>> >>> is places that are creating addrs aren't setting the type properly.  Going
> > > >> >>> >>> to play with this a bit...
> > > >> >>> >>
> > > >> >>> >> I pushed a wip-addr-more branch:
> > > >> >>> >>
> > > >> >>> >>         https://github.com/liewegas/ceph/commits/wip-addr-more
> > > >> >>> >>
> > > >> >>> >> This fixes the parser to take type prefixes, default to legacy, and
> > > >> >>> >> changes entity_addr_t() so that it renders are "-" instead of ":/0" (which
> > > >> >>> >> was weird anyway).
> > > >> >>> >>
> > > >> >>> >> I think what we need next are some unit tests in test/test_addrs.cc that
> > > >> >>> >> do things like encode a entity_addr_t with no features, decode as
> > > >> >>> >> entity_addrvec_t, or construct varous addrvecs, encode with no features,
> > > >> >>> >> an ensure a decode to entity_addr_t gives back the right addr from the
> > > >> >>> >> list.
> > > >> >>> >>
> > > >> >>> >> sage
> > > >> >>> --
> > > >> >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> > > >> >>> the body of a message to majordomo@vger.kernel.org
> > > >> >>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > >> >>>
> > > >> >>>
> > > >>
> > > >>
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > >
> > >
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Haomai Wang June 21, 2016, 2:17 a.m. UTC | #13
On Tue, Jun 21, 2016 at 1:47 AM, Sage Weil <sage@newdream.net> wrote:
> On Thu, 16 Jun 2016, Haomai Wang wrote:
>> On Thu, Jun 16, 2016 at 5:17 AM, Sage Weil <sage@newdream.net> wrote:
>> >
>> > I've rebased this and pushed a new wip-addrvec branch, PR
>> >
>> >         https://github.com/ceph/ceph/pull/9726
>> >
>> > I wonder if a good way to ease into using this would be to make
>> > AsyncMessenger allow binding to multiple addresses (IPv4 and IPv6).
>> >
>> > Haomai, what do you think?
>> >
>> > That will help us fix the multiple-address issues in msg/async (for
>> > example, announcing ourselves using the correct addr depending on which
>> > socket we accepted the connection on).  That will be one less thing in the
>> > way to making msg/async speak msgr2 on some connections.
>>
>> Hmm, you mean in multi osd instance example, we need to make each osd
>> instance own a bind address?
>>
>> What's the problem if they use the same address, I guess the reason
>> should be we have existing codes rely on this independent address
>> potentially.
>
> We assume the addrs are unique (hence the nonce) all over the place.
>
> I wasn't thinking about the many-osds-in-one-process case, though.  A
> simpler step is to make a single process, single osd bind to both ipv4 and
> ipv6 addresses, and advertise both in the OSDMap.  (Then later we'll
> extent that to also have the legacy and msgr2 addrs.)

Oh~

>
> So msg/async will need to bind to multiple addrs, revise the get_myaddr
> interface to return a vector of addresses, and put that in the OSDMap.
> And when accepting a connection, we mark the Connection/Pipe to indicate
> which socket it connected on, and adjust the addr we advertise with
> the initial handshake accordingly.
>
> I suspect there is a bunch of annoying cleanup around all things
> myaddr...

Hmm, if maintain two or more bind address in the same Messenger, it
will add complexities to messenger itself and caller. Like we need to
impl namespace in messenger. If we only want to separate v1/v2
address, we could use two Messenger in caller? It looks simple and
less to change existing codes. AsyncMessenger is onlly a collection of
connections without adding thread, so it would be cheap to add extra
messenger.

>
> sage
>
>
>>
>> >
>> >
>> > I started working on another branch that cleans up and refactors some of
>> > the authentication callbacks from the msgr into osd/mds/mon code so that
>> > we can accomodate both msgr1 and msgr2 authentication.
>> >
>> >         https://github.com/ceph/ceph/pull/9727
>> >
>> > A bit more cleanup still to do there, but soon I think we'll be to the
>> > point where we need the msg/async half that implements the protocol so we
>> > can wire the two together.
>>
>>
>> ok...
>>
>> >
>> >
>> > sage
>> >
>> >
>> > On Sat, 11 Jun 2016, Junwang Zhao wrote:
>> >
>> > > Hi Sage,
>> > >
>> > > I have modified the patch to make it PASS the unittest, you may want
>> > > to have a look and cherry-pick the updated commit[0].
>> > >
>> > > Thanks,
>> > > Zhao
>> > >
>> > > [0] https://github.com/zhjwpku/ceph/commit/b1f1183e3e2f39bbdbfa326d0dfc6f9b95611321
>> > >
>> > > On Sat, Jun 11, 2016 at 1:07 AM, Sage Weil <sage@newdream.net> wrote:
>> > > > On Fri, 10 Jun 2016, Junwang Zhao wrote:
>> > > >> Hi Sage,
>> > > >>
>> > > >> Updated and pushed to another branch, see [0], the 'make check'
>> > > >> shows that the test cases FAILS, could you tell me how to debug
>> > > >> these unittest? I hope to see the result using "cout" but I don't know
>> > > >> where those output goes to :(
>> > > >>
>> > > >> [0]https://github.com/zhjwpku/ceph/commit/e7307deae01fd6b675350fdae9c493eae04fdde9
>> > > >
>> > > > I fixed a problem in the messengesr that was breaking the addr learning,
>> > > > and thus a bunch of the unit tests that try to run test clusters.  I also
>> > > > rebased it on master.  Pushed an updated wip-addr-more to my git tree
>> > > > (liewegas/ceph)!
>> > > >
>> > > > sage
>> > > >
>> > > >>
>> > > >> Thanks,
>> > > >> Zhao
>> > > >>
>> > > >> On Fri, Jun 10, 2016 at 10:37 AM, Junwang Zhao <zhjwpku@gmail.com> wrote:
>> > > >> > Hi Sage,
>> > > >> >
>> > > >> > I have go a little bit further, the cases you mentioned really make sense,
>> > > >> > please see [0] and make some comments, I am still not good at writting
>> > > >> > the unittest.
>> > > >> >
>> > > >> > [0]https://github.com/zhjwpku/ceph/commit/4da16085a650f1dd0b743560836fdf213507d3b4
>> > > >> >
>> > > >> > Thanks!
>> > > >> > Zhao
>> > > >> >
>> > > >> > On Fri, Jun 10, 2016 at 12:10 AM, Sage Weil <sage@newdream.net> wrote:
>> > > >> >> On Wed, 8 Jun 2016, Junwang Zhao wrote:
>> > > >> >>> Hi Sage,
>> > > >> >>>
>> > > >> >>> Write some unittest, but seems not right, see [0], give some comments
>> > > >> >>> please, I use `cout` to print the value, but I don't know where to see them.
>> > > >> >>>
>> > > >> >>> I got a lot of other "FAIL" in the wip-addr-more branch, like:
>> > > >> >>>   ceph-detect-init/run-tox.sh
>> > > >> >>>   ceph-disk/run-tox.sh
>> > > >> >>>   test/run-rbd-unit-tests.sh
>> > > >> >>> I am wondering is this because my working environment? My laptop
>> > > >> >>> is really not very powerful, "i5-3210M CPU, 8G memory" :(
>> > > >> >>> make check is quite slow.
>> > > >> >>>
>> > > >> >>> [0] https://github.com/zhjwpku/ceph/commit/e15a240d56c2b67a8b0906dc80650a91bff1518f
>> > > >> >>
>> > > >> >> I'm traveling today and not able to replicate this yet--I'll take a look
>> > > >> >> tomorrow.
>> > > >> >>
>> > > >> >> In the meantime, though, it doesn't look like your new tests are doing
>> > > >> >> quite what I think we need them to.  I think the important cases are:
>> > > >> >>
>> > > >> >> 1- make sure that any encoded entity_addr_t can be decoded as a
>> > > >> >> entity_addrvec_t, with size 1, where item 0 matches the addr.  Can use the
>> > > >> >> existing arrays of addrs to parse to test this.
>> > > >> >>
>> > > >> >> 2- make sure that in all of those cases, if we encode the entity_addrvec_t
>> > > >> >> with features 0, and decode as an entity_addr_t, we get back the same
>> > > >> >> entity_addr_t.
>> > > >> >>
>> > > >> >> 3- the addrvec size > 1 cases:
>> > > >> >>
>> > > >> >> - populate an addrvec with multiple addrs where one is legacy and others
>> > > >> >> are not, and ensure that encoding with features 0 and decoding as an addr
>> > > >> >> gives back the legacy item.
>> > > >> >>
>> > > >> >> - multiple legacy addrs with features 0 should encode the first one in the
>> > > >> >> list
>> > > >> >>
>> > > >> >> - a vec with all non-legacy addrs encoded with features 0 and decoded as
>> > > >> >> addr should give back an addr that is not entity_addr_t() but is
>> > > >> >> unusable/bogus.
>> > > >> >>
>> > > >> >> 4- the empty addrvec case:
>> > > >> >>
>> > > >> >> - encode with features 0 and decoding as addr should give back
>> > > >> >> entity_addr_t().
>> > > >> >>
>> > > >> >> I think that's it?
>> > > >> >>
>> > > >> >> sage
>> > > >> >>
>> > > >> >>
>> > > >> >>>
>> > > >> >>> On Mon, Jun 6, 2016 at 4:28 PM, Junwang Zhao <zhjwpku@gmail.com> wrote:
>> > > >> >>> > Hi, Sage,
>> > > >> >>> >
>> > > >> >>> > Sorry I did not see your reply util now, I will work on it.
>> > > >> >>> >
>> > > >> >>> > Thanks!
>> > > >> >>> > Zhao
>> > > >> >>> >
>> > > >> >>> > On Fri, Jun 3, 2016 at 1:46 AM, Sage Weil <sage@newdream.net> wrote:
>> > > >> >>> >> On Thu, 2 Jun 2016, Sage Weil wrote:
>> > > >> >>> >>> Hi Zhao,
>> > > >> >>> >>>
>> > > >> >>> >>> On Tue, 31 May 2016, Junwang Zhao wrote:
>> > > >> >>> >>> > Hi Sage & Haomai,
>> > > >> >>> >>> >
>> > > >> >>> >>> > I try to run './vstart.sh -d -x -n' to test the patches today, I got
>> > > >> >>> >>> > some error like 'wrong node', after some work, I found the
>> > > >> >>> >>> > error was introduced by this patch[0]. I change the code to
>> > > >> >>> >>> > use the legacy decode and encode, but got stuck somewhere.
>> > > >> >>> >>> >
>> > > >> >>> >>> > --- a/src/msg/msg_types.h
>> > > >> >>> >>> > +++ b/src/msg/msg_types.h
>> > > >> >>> >>> > @@ -371,7 +371,7 @@ struct entity_addr_t {
>> > > >> >>> >>> >    // broader study
>> > > >> >>> >>> >
>> > > >> >>> >>> >    void encode(bufferlist& bl, uint64_t features) const {
>> > > >> >>> >>> > -    if ((features & CEPH_FEATURE_MSG_ADDR2) == 0) {
>> > > >> >>> >>> > +    if ((features & CEPH_FEATURE_MSG_ADDR2) != 0) {
>> > > >> >>> >>> >        ::encode((__u32)0, bl);
>> > > >> >>> >>> >        ::encode(nonce, bl);
>> > > >> >>> >>> >        sockaddr_storage ss = get_sockaddr_storage();
>> > > >> >>> >>> >
>> > > >> >>> >>> >
>> > > >> >>> >>> > Can you please see the patch again and give some comments?
>> > > >> >>> >>> >
>> > > >> >>> >>> > [0]https://github.com/zhjwpku/ceph/commit/8934f038fdf28156038770e601e4a27f7a684390
>> > > >> >>> >>>
>> > > >> >>> >>> This was enough to make things work fo rme:
>> > > >> >>> >>>
>> > > >> >>> >>> diff --git a/src/msg/msg_types.h b/src/msg/msg_types.h
>> > > >> >>> >>> index 9c521e6..0da0614 100644
>> > > >> >>> >>> --- a/src/msg/msg_types.h
>> > > >> >>> >>> +++ b/src/msg/msg_types.h
>> > > >> >>> >>> @@ -199,8 +199,7 @@ WRITE_CLASS_ENCODER(ceph_sockaddr_storage)
>> > > >> >>> >>>
>> > > >> >>> >>>  struct entity_addr_t {
>> > > >> >>> >>>    typedef enum {
>> > > >> >>> >>> -    TYPE_NONE = 0,
>> > > >> >>> >>> -    TYPE_LEGACY = 1,
>> > > >> >>> >>> +    TYPE_LEGACY = 0,
>> > > >> >>> >>>    } type_t;
>> > > >> >>> >>>
>> > > >> >>> >>>    __u32 type;
>> > > >> >>> >>> @@ -211,7 +210,7 @@ struct entity_addr_t {
>> > > >> >>> >>>      sockaddr_in6 sin6;
>> > > >> >>> >>>    } u;
>> > > >> >>> >>>
>> > > >> >>> >>> -  entity_addr_t() : type(0), nonce(0) {
>> > > >> >>> >>> +  entity_addr_t() : type(TYPE_LEGACY), nonce(0) {
>> > > >> >>> >>>      memset(&u, 0, sizeof(u));
>> > > >> >>> >>>    }
>> > > >> >>> >>>    explicit entity_addr_t(const ceph_entity_addr &o) {
>> > > >> >>> >>>
>> > > >> >>> >>> but I think this is not the right fix.  I think we still want
>> > > >> >>> >>> TYPE_NONE, TYPE_LEGACY, and (soon) TYPE_MSGR2, and that the core problem
>> > > >> >>> >>> is places that are creating addrs aren't setting the type properly.  Going
>> > > >> >>> >>> to play with this a bit...
>> > > >> >>> >>
>> > > >> >>> >> I pushed a wip-addr-more branch:
>> > > >> >>> >>
>> > > >> >>> >>         https://github.com/liewegas/ceph/commits/wip-addr-more
>> > > >> >>> >>
>> > > >> >>> >> This fixes the parser to take type prefixes, default to legacy, and
>> > > >> >>> >> changes entity_addr_t() so that it renders are "-" instead of ":/0" (which
>> > > >> >>> >> was weird anyway).
>> > > >> >>> >>
>> > > >> >>> >> I think what we need next are some unit tests in test/test_addrs.cc that
>> > > >> >>> >> do things like encode a entity_addr_t with no features, decode as
>> > > >> >>> >> entity_addrvec_t, or construct varous addrvecs, encode with no features,
>> > > >> >>> >> an ensure a decode to entity_addr_t gives back the right addr from the
>> > > >> >>> >> list.
>> > > >> >>> >>
>> > > >> >>> >> sage
>> > > >> >>> --
>> > > >> >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> > > >> >>> the body of a message to majordomo@vger.kernel.org
>> > > >> >>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> > > >> >>>
>> > > >> >>>
>> > > >>
>> > > >>
>> > > --
>> > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> > > the body of a message to majordomo@vger.kernel.org
>> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> > >
>> > >
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch
diff mbox

diff --git a/src/msg/msg_types.h b/src/msg/msg_types.h
index 9c521e6..0da0614 100644
--- a/src/msg/msg_types.h
+++ b/src/msg/msg_types.h
@@ -199,8 +199,7 @@  WRITE_CLASS_ENCODER(ceph_sockaddr_storage)
 
 struct entity_addr_t {
   typedef enum {
-    TYPE_NONE = 0,
-    TYPE_LEGACY = 1,
+    TYPE_LEGACY = 0,
   } type_t;
 
   __u32 type;
@@ -211,7 +210,7 @@  struct entity_addr_t {
     sockaddr_in6 sin6;
   } u;
 
-  entity_addr_t() : type(0), nonce(0) { 
+  entity_addr_t() : type(TYPE_LEGACY), nonce(0) {
     memset(&u, 0, sizeof(u));
   }
   explicit entity_addr_t(const ceph_entity_addr &o) {