diff mbox series

[v3] lightnvm: pblk: ignore bad block wp for pblk_line_wp_is_unbalanced

Message ID 1548359553-11356-1-git-send-email-zjwu@marvell.com (mailing list archive)
State New, archived
Headers show
Series [v3] lightnvm: pblk: ignore bad block wp for pblk_line_wp_is_unbalanced | expand

Commit Message

Zhoujie Wu Jan. 24, 2019, 7:52 p.m. UTC
The write pointer of the bad block could be 0 or undefined, ignore
the checking of the bad block wp for pblk_line_wp_is_unbalanced to
avoid fake warning.

Signed-off-by: Zhoujie Wu <zjwu@marvell.com>
---
v3: return in case bit >= lm->blk_per_line.
v2: changed according to Javier's comments.

 drivers/lightnvm/pblk-recovery.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

Comments

Hans Holmberg Jan. 25, 2019, 8:47 a.m. UTC | #1
On Thu, Jan 24, 2019 at 8:51 PM Zhoujie Wu <zjwu@marvell.com> wrote:
>
> The write pointer of the bad block could be 0 or undefined, ignore
> the checking of the bad block wp for pblk_line_wp_is_unbalanced to
> avoid fake warning.

fake -> spurious?

>
> Signed-off-by: Zhoujie Wu <zjwu@marvell.com>
> ---
> v3: return in case bit >= lm->blk_per_line.
> v2: changed according to Javier's comments.
>
>  drivers/lightnvm/pblk-recovery.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
> index 6761d2a..02d466e 100644
> --- a/drivers/lightnvm/pblk-recovery.c
> +++ b/drivers/lightnvm/pblk-recovery.c
> @@ -312,21 +312,27 @@ static int pblk_line_wp_is_unbalanced(struct pblk *pblk,
>         struct nvm_chk_meta *chunk;
>         struct ppa_addr ppa;
>         u64 line_wp;
> -       int pos, i;
> +       int pos, i, bit;

We don't need both bit and i, one of them is enough.

>
> -       rlun = &pblk->luns[0];
> +       bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
> +       if (bit >= lm->blk_per_line)
> +               return 0;

If there is only one non-offline chunk in the line, the wp can't be unbalanced,
so it should be safe to return 0 here if bit >= lm->blk_per_line - 1

If you change this please document why using a comment, as it might
not be obvious

> +       rlun = &pblk->luns[bit];
>         ppa = rlun->bppa;
>         pos = pblk_ppa_to_pos(geo, ppa);
>         chunk = &line->chks[pos];
>

>         line_wp = chunk->wp;
>
> -       for (i = 1; i < lm->blk_per_line; i++) {
> +       for (i = bit + 1; i < lm->blk_per_line; i++) {

>                 rlun = &pblk->luns[i];
>                 ppa = rlun->bppa;
>                 pos = pblk_ppa_to_pos(geo, ppa);
>                 chunk = &line->chks[pos];

This code is a copy of the code above, it'd be nice to refactor it
into a helper function or just do the chunk lookups in one place.

>
> +               if (chunk->state & NVM_CHK_ST_OFFLINE)
> +                       continue;
> +

Since we rely on the block bitmap anyway, we might as well just
iterate over the zeroes in the block bitmap using find_next_zero_bit
instead.
We do this in lots of other places, see: git grep -n
find_next_zero_bit -- drivers/lightnvm

Apart from these nitpicks, the change looks good to me. Great to have
this fixed.

>                 if (chunk->wp > line_wp)
>                         return 1;
>                 else if (chunk->wp < line_wp)
> --
> 1.9.1
>
Javier González Jan. 25, 2019, 9:41 a.m. UTC | #2
> On 25 Jan 2019, at 09.47, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
> 
> On Thu, Jan 24, 2019 at 8:51 PM Zhoujie Wu <zjwu@marvell.com> wrote:
>> The write pointer of the bad block could be 0 or undefined, ignore
>> the checking of the bad block wp for pblk_line_wp_is_unbalanced to
>> avoid fake warning.
> 
> fake -> spurious?
> 
>> Signed-off-by: Zhoujie Wu <zjwu@marvell.com>
>> ---
>> v3: return in case bit >= lm->blk_per_line.
>> v2: changed according to Javier's comments.
>> 
>> drivers/lightnvm/pblk-recovery.c | 12 +++++++++---
>> 1 file changed, 9 insertions(+), 3 deletions(-)
>> 
>> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
>> index 6761d2a..02d466e 100644
>> --- a/drivers/lightnvm/pblk-recovery.c
>> +++ b/drivers/lightnvm/pblk-recovery.c
>> @@ -312,21 +312,27 @@ static int pblk_line_wp_is_unbalanced(struct pblk *pblk,
>>        struct nvm_chk_meta *chunk;
>>        struct ppa_addr ppa;
>>        u64 line_wp;
>> -       int pos, i;
>> +       int pos, i, bit;
> 
> We don't need both bit and i, one of them is enough.
> 
>> -       rlun = &pblk->luns[0];
>> +       bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
>> +       if (bit >= lm->blk_per_line)
>> +               return 0;
> 
> If there is only one non-offline chunk in the line, the wp can't be unbalanced,
> so it should be safe to return 0 here if bit >= lm->blk_per_line - 1
> 
> If you change this please document why using a comment, as it might
> not be obvious
> 
>> +       rlun = &pblk->luns[bit];
>>        ppa = rlun->bppa;
>>        pos = pblk_ppa_to_pos(geo, ppa);
>>        chunk = &line->chks[pos];
> 
>>        line_wp = chunk->wp;
>> 
>> -       for (i = 1; i < lm->blk_per_line; i++) {
>> +       for (i = bit + 1; i < lm->blk_per_line; i++) {
> 
>>                rlun = &pblk->luns[i];
>>                ppa = rlun->bppa;
>>                pos = pblk_ppa_to_pos(geo, ppa);
>>                chunk = &line->chks[pos];
> 
> This code is a copy of the code above, it'd be nice to refactor it
> into a helper function or just do the chunk lookups in one place.
> 
>> +               if (chunk->state & NVM_CHK_ST_OFFLINE)
>> +                       continue;
>> +
> 
> Since we rely on the block bitmap anyway, we might as well just
> iterate over the zeroes in the block bitmap using find_next_zero_bit
> instead.
> We do this in lots of other places, see: git grep -n
> find_next_zero_bit -- drivers/lightnvm
> 

Hans, I proposed him to use the chunk->state instead. I think it is way
more robust. We introduced the block bitmap for OCSSD 1.2, because there
was no state. Now that we have state, it is better to use it instead. In
fact, we should remove the bock bitmap as we have to check for the state
either way - note that this aligns also very well with you patches
removing the other bitmaps.

Javier
Hans Holmberg Jan. 25, 2019, 12:59 p.m. UTC | #3
On Fri, Jan 25, 2019 at 10:41 AM Javier González <javier@javigon.com> wrote:
>
>
> > On 25 Jan 2019, at 09.47, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
> >
> > On Thu, Jan 24, 2019 at 8:51 PM Zhoujie Wu <zjwu@marvell.com> wrote:
> >> The write pointer of the bad block could be 0 or undefined, ignore
> >> the checking of the bad block wp for pblk_line_wp_is_unbalanced to
> >> avoid fake warning.
> >
> > fake -> spurious?
> >
> >> Signed-off-by: Zhoujie Wu <zjwu@marvell.com>
> >> ---
> >> v3: return in case bit >= lm->blk_per_line.
> >> v2: changed according to Javier's comments.
> >>
> >> drivers/lightnvm/pblk-recovery.c | 12 +++++++++---
> >> 1 file changed, 9 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
> >> index 6761d2a..02d466e 100644
> >> --- a/drivers/lightnvm/pblk-recovery.c
> >> +++ b/drivers/lightnvm/pblk-recovery.c
> >> @@ -312,21 +312,27 @@ static int pblk_line_wp_is_unbalanced(struct pblk *pblk,
> >>        struct nvm_chk_meta *chunk;
> >>        struct ppa_addr ppa;
> >>        u64 line_wp;
> >> -       int pos, i;
> >> +       int pos, i, bit;
> >
> > We don't need both bit and i, one of them is enough.
> >
> >> -       rlun = &pblk->luns[0];
> >> +       bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
> >> +       if (bit >= lm->blk_per_line)
> >> +               return 0;
> >
> > If there is only one non-offline chunk in the line, the wp can't be unbalanced,
> > so it should be safe to return 0 here if bit >= lm->blk_per_line - 1
> >
> > If you change this please document why using a comment, as it might
> > not be obvious
> >
> >> +       rlun = &pblk->luns[bit];
> >>        ppa = rlun->bppa;
> >>        pos = pblk_ppa_to_pos(geo, ppa);
> >>        chunk = &line->chks[pos];
> >
> >>        line_wp = chunk->wp;
> >>
> >> -       for (i = 1; i < lm->blk_per_line; i++) {
> >> +       for (i = bit + 1; i < lm->blk_per_line; i++) {
> >
> >>                rlun = &pblk->luns[i];
> >>                ppa = rlun->bppa;
> >>                pos = pblk_ppa_to_pos(geo, ppa);
> >>                chunk = &line->chks[pos];
> >
> > This code is a copy of the code above, it'd be nice to refactor it
> > into a helper function or just do the chunk lookups in one place.
> >
> >> +               if (chunk->state & NVM_CHK_ST_OFFLINE)
> >> +                       continue;
> >> +
> >
> > Since we rely on the block bitmap anyway, we might as well just
> > iterate over the zeroes in the block bitmap using find_next_zero_bit
> > instead.
> > We do this in lots of other places, see: git grep -n
> > find_next_zero_bit -- drivers/lightnvm
> >
>
> Hans, I proposed him to use the chunk->state instead. I think it is way
> more robust. We introduced the block bitmap for OCSSD 1.2, because there
> was no state. Now that we have state, it is better to use it instead. In
> fact, we should remove the bock bitmap as we have to check for the state
> either way - note that this aligns also very well with you patches
> removing the other bitmaps.

These are just nitpicks.

Relying on two data structures(chunks, block bitmap) to be in sync in
this function in stead of one does not make it more robust imho.
Either or (checking chunks or the block bitmap) is fine by me.
Searching the bitmap is more efficient, so that is what I proposed.

If we want to remove the block bitmap, we can do that as a separate patch(set)
I do agree, having two copies of the chunk state is something worth
getting rid of :)

>
> Javier
Javier González Jan. 25, 2019, 1:33 p.m. UTC | #4
> On 25 Jan 2019, at 13.59, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
> 
> On Fri, Jan 25, 2019 at 10:41 AM Javier González <javier@javigon.com> wrote:
>>> On 25 Jan 2019, at 09.47, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
>>> 
>>> On Thu, Jan 24, 2019 at 8:51 PM Zhoujie Wu <zjwu@marvell.com> wrote:
>>>> The write pointer of the bad block could be 0 or undefined, ignore
>>>> the checking of the bad block wp for pblk_line_wp_is_unbalanced to
>>>> avoid fake warning.
>>> 
>>> fake -> spurious?
>>> 
>>>> Signed-off-by: Zhoujie Wu <zjwu@marvell.com>
>>>> ---
>>>> v3: return in case bit >= lm->blk_per_line.
>>>> v2: changed according to Javier's comments.
>>>> 
>>>> drivers/lightnvm/pblk-recovery.c | 12 +++++++++---
>>>> 1 file changed, 9 insertions(+), 3 deletions(-)
>>>> 
>>>> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
>>>> index 6761d2a..02d466e 100644
>>>> --- a/drivers/lightnvm/pblk-recovery.c
>>>> +++ b/drivers/lightnvm/pblk-recovery.c
>>>> @@ -312,21 +312,27 @@ static int pblk_line_wp_is_unbalanced(struct pblk *pblk,
>>>>       struct nvm_chk_meta *chunk;
>>>>       struct ppa_addr ppa;
>>>>       u64 line_wp;
>>>> -       int pos, i;
>>>> +       int pos, i, bit;
>>> 
>>> We don't need both bit and i, one of them is enough.
>>> 
>>>> -       rlun = &pblk->luns[0];
>>>> +       bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
>>>> +       if (bit >= lm->blk_per_line)
>>>> +               return 0;
>>> 
>>> If there is only one non-offline chunk in the line, the wp can't be unbalanced,
>>> so it should be safe to return 0 here if bit >= lm->blk_per_line - 1
>>> 
>>> If you change this please document why using a comment, as it might
>>> not be obvious
>>> 
>>>> +       rlun = &pblk->luns[bit];
>>>>       ppa = rlun->bppa;
>>>>       pos = pblk_ppa_to_pos(geo, ppa);
>>>>       chunk = &line->chks[pos];
>>> 
>>>>       line_wp = chunk->wp;
>>>> 
>>>> -       for (i = 1; i < lm->blk_per_line; i++) {
>>>> +       for (i = bit + 1; i < lm->blk_per_line; i++) {
>>> 
>>>>               rlun = &pblk->luns[i];
>>>>               ppa = rlun->bppa;
>>>>               pos = pblk_ppa_to_pos(geo, ppa);
>>>>               chunk = &line->chks[pos];
>>> 
>>> This code is a copy of the code above, it'd be nice to refactor it
>>> into a helper function or just do the chunk lookups in one place.
>>> 
>>>> +               if (chunk->state & NVM_CHK_ST_OFFLINE)
>>>> +                       continue;
>>>> +
>>> 
>>> Since we rely on the block bitmap anyway, we might as well just
>>> iterate over the zeroes in the block bitmap using find_next_zero_bit
>>> instead.
>>> We do this in lots of other places, see: git grep -n
>>> find_next_zero_bit -- drivers/lightnvm
>> 
>> Hans, I proposed him to use the chunk->state instead. I think it is way
>> more robust. We introduced the block bitmap for OCSSD 1.2, because there
>> was no state. Now that we have state, it is better to use it instead. In
>> fact, we should remove the bock bitmap as we have to check for the state
>> either way - note that this aligns also very well with you patches
>> removing the other bitmaps.
> 
> These are just nitpicks.
> 
> Relying on two data structures(chunks, block bitmap) to be in sync in
> this function in stead of one does not make it more robust imho.
> Either or (checking chunks or the block bitmap) is fine by me.
> Searching the bitmap is more efficient, so that is what I proposed.

chunk log page is the ground truth, so it is more robust.

Also, pblk has a long way to start seeing bitmap search vs. integer
comparisons in profiling.

> 
> If we want to remove the block bitmap, we can do that as a separate patch(set)
> I do agree, having two copies of the chunk state is something worth
> getting rid of :)
> 

A good start is not adding code using what we want to remove.

Javier
Hans Holmberg Jan. 25, 2019, 2:21 p.m. UTC | #5
On Fri, Jan 25, 2019 at 2:33 PM Javier González <javier@javigon.com> wrote:
>
>
> > On 25 Jan 2019, at 13.59, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
> >
> > On Fri, Jan 25, 2019 at 10:41 AM Javier González <javier@javigon.com> wrote:
> >>> On 25 Jan 2019, at 09.47, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
> >>>
> >>> On Thu, Jan 24, 2019 at 8:51 PM Zhoujie Wu <zjwu@marvell.com> wrote:
> >>>> The write pointer of the bad block could be 0 or undefined, ignore
> >>>> the checking of the bad block wp for pblk_line_wp_is_unbalanced to
> >>>> avoid fake warning.
> >>>
> >>> fake -> spurious?
> >>>
> >>>> Signed-off-by: Zhoujie Wu <zjwu@marvell.com>
> >>>> ---
> >>>> v3: return in case bit >= lm->blk_per_line.
> >>>> v2: changed according to Javier's comments.
> >>>>
> >>>> drivers/lightnvm/pblk-recovery.c | 12 +++++++++---
> >>>> 1 file changed, 9 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
> >>>> index 6761d2a..02d466e 100644
> >>>> --- a/drivers/lightnvm/pblk-recovery.c
> >>>> +++ b/drivers/lightnvm/pblk-recovery.c
> >>>> @@ -312,21 +312,27 @@ static int pblk_line_wp_is_unbalanced(struct pblk *pblk,
> >>>>       struct nvm_chk_meta *chunk;
> >>>>       struct ppa_addr ppa;
> >>>>       u64 line_wp;
> >>>> -       int pos, i;
> >>>> +       int pos, i, bit;
> >>>
> >>> We don't need both bit and i, one of them is enough.
> >>>
> >>>> -       rlun = &pblk->luns[0];
> >>>> +       bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
> >>>> +       if (bit >= lm->blk_per_line)
> >>>> +               return 0;
> >>>
> >>> If there is only one non-offline chunk in the line, the wp can't be unbalanced,
> >>> so it should be safe to return 0 here if bit >= lm->blk_per_line - 1
> >>>
> >>> If you change this please document why using a comment, as it might
> >>> not be obvious
> >>>
> >>>> +       rlun = &pblk->luns[bit];
> >>>>       ppa = rlun->bppa;
> >>>>       pos = pblk_ppa_to_pos(geo, ppa);
> >>>>       chunk = &line->chks[pos];
> >>>
> >>>>       line_wp = chunk->wp;
> >>>>
> >>>> -       for (i = 1; i < lm->blk_per_line; i++) {
> >>>> +       for (i = bit + 1; i < lm->blk_per_line; i++) {
> >>>
> >>>>               rlun = &pblk->luns[i];
> >>>>               ppa = rlun->bppa;
> >>>>               pos = pblk_ppa_to_pos(geo, ppa);
> >>>>               chunk = &line->chks[pos];
> >>>
> >>> This code is a copy of the code above, it'd be nice to refactor it
> >>> into a helper function or just do the chunk lookups in one place.
> >>>
> >>>> +               if (chunk->state & NVM_CHK_ST_OFFLINE)
> >>>> +                       continue;
> >>>> +
> >>>
> >>> Since we rely on the block bitmap anyway, we might as well just
> >>> iterate over the zeroes in the block bitmap using find_next_zero_bit
> >>> instead.
> >>> We do this in lots of other places, see: git grep -n
> >>> find_next_zero_bit -- drivers/lightnvm
> >>
> >> Hans, I proposed him to use the chunk->state instead. I think it is way
> >> more robust. We introduced the block bitmap for OCSSD 1.2, because there
> >> was no state. Now that we have state, it is better to use it instead. In
> >> fact, we should remove the bock bitmap as we have to check for the state
> >> either way - note that this aligns also very well with you patches
> >> removing the other bitmaps.
> >
> > These are just nitpicks.
> >
> > Relying on two data structures(chunks, block bitmap) to be in sync in
> > this function in stead of one does not make it more robust imho.
> > Either or (checking chunks or the block bitmap) is fine by me.
> > Searching the bitmap is more efficient, so that is what I proposed.
>
> chunk log page is the ground truth, so it is more robust.
>
> Also, pblk has a long way to start seeing bitmap search vs. integer
> comparisons in profiling.

Hehe, yeah, but it does not hurt to use the better alternative when available.

> >
> > If we want to remove the block bitmap, we can do that as a separate patch(set)
> > I do agree, having two copies of the chunk state is something worth
> > getting rid of :)
> >
>
> A good start is not adding code using what we want to remove.

Well, I think it's very confusing to use both copies in the same function.

Now we're nitpicking nitpicks :)
Matias Bjorling Jan. 25, 2019, 2:35 p.m. UTC | #6
On 1/25/19 3:21 PM, Hans Holmberg wrote:
> On Fri, Jan 25, 2019 at 2:33 PM Javier González <javier@javigon.com> wrote:
>>
>>
>>> On 25 Jan 2019, at 13.59, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
>>>
>>> On Fri, Jan 25, 2019 at 10:41 AM Javier González <javier@javigon.com> wrote:
>>>>> On 25 Jan 2019, at 09.47, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
>>>>>
>>>>> On Thu, Jan 24, 2019 at 8:51 PM Zhoujie Wu <zjwu@marvell.com> wrote:
>>>>>> The write pointer of the bad block could be 0 or undefined, ignore
>>>>>> the checking of the bad block wp for pblk_line_wp_is_unbalanced to
>>>>>> avoid fake warning.
>>>>>
>>>>> fake -> spurious?
>>>>>
>>>>>> Signed-off-by: Zhoujie Wu <zjwu@marvell.com>
>>>>>> ---
>>>>>> v3: return in case bit >= lm->blk_per_line.
>>>>>> v2: changed according to Javier's comments.
>>>>>>
>>>>>> drivers/lightnvm/pblk-recovery.c | 12 +++++++++---
>>>>>> 1 file changed, 9 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
>>>>>> index 6761d2a..02d466e 100644
>>>>>> --- a/drivers/lightnvm/pblk-recovery.c
>>>>>> +++ b/drivers/lightnvm/pblk-recovery.c
>>>>>> @@ -312,21 +312,27 @@ static int pblk_line_wp_is_unbalanced(struct pblk *pblk,
>>>>>>        struct nvm_chk_meta *chunk;
>>>>>>        struct ppa_addr ppa;
>>>>>>        u64 line_wp;
>>>>>> -       int pos, i;
>>>>>> +       int pos, i, bit;
>>>>>
>>>>> We don't need both bit and i, one of them is enough.
>>>>>
>>>>>> -       rlun = &pblk->luns[0];
>>>>>> +       bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
>>>>>> +       if (bit >= lm->blk_per_line)
>>>>>> +               return 0;
>>>>>
>>>>> If there is only one non-offline chunk in the line, the wp can't be unbalanced,
>>>>> so it should be safe to return 0 here if bit >= lm->blk_per_line - 1
>>>>>
>>>>> If you change this please document why using a comment, as it might
>>>>> not be obvious
>>>>>
>>>>>> +       rlun = &pblk->luns[bit];
>>>>>>        ppa = rlun->bppa;
>>>>>>        pos = pblk_ppa_to_pos(geo, ppa);
>>>>>>        chunk = &line->chks[pos];
>>>>>
>>>>>>        line_wp = chunk->wp;
>>>>>>
>>>>>> -       for (i = 1; i < lm->blk_per_line; i++) {
>>>>>> +       for (i = bit + 1; i < lm->blk_per_line; i++) {
>>>>>
>>>>>>                rlun = &pblk->luns[i];
>>>>>>                ppa = rlun->bppa;
>>>>>>                pos = pblk_ppa_to_pos(geo, ppa);
>>>>>>                chunk = &line->chks[pos];
>>>>>
>>>>> This code is a copy of the code above, it'd be nice to refactor it
>>>>> into a helper function or just do the chunk lookups in one place.
>>>>>
>>>>>> +               if (chunk->state & NVM_CHK_ST_OFFLINE)
>>>>>> +                       continue;
>>>>>> +
>>>>>
>>>>> Since we rely on the block bitmap anyway, we might as well just
>>>>> iterate over the zeroes in the block bitmap using find_next_zero_bit
>>>>> instead.
>>>>> We do this in lots of other places, see: git grep -n
>>>>> find_next_zero_bit -- drivers/lightnvm
>>>>
>>>> Hans, I proposed him to use the chunk->state instead. I think it is way
>>>> more robust. We introduced the block bitmap for OCSSD 1.2, because there
>>>> was no state. Now that we have state, it is better to use it instead. In
>>>> fact, we should remove the bock bitmap as we have to check for the state
>>>> either way - note that this aligns also very well with you patches
>>>> removing the other bitmaps.
>>>
>>> These are just nitpicks.
>>>
>>> Relying on two data structures(chunks, block bitmap) to be in sync in
>>> this function in stead of one does not make it more robust imho.
>>> Either or (checking chunks or the block bitmap) is fine by me.
>>> Searching the bitmap is more efficient, so that is what I proposed.
>>
>> chunk log page is the ground truth, so it is more robust.
>>
>> Also, pblk has a long way to start seeing bitmap search vs. integer
>> comparisons in profiling.
> 
> Hehe, yeah, but it does not hurt to use the better alternative when available.
> 
>>>
>>> If we want to remove the block bitmap, we can do that as a separate patch(set)
>>> I do agree, having two copies of the chunk state is something worth
>>> getting rid of :)
>>>
>>
>> A good start is not adding code using what we want to remove.
> 
> Well, I think it's very confusing to use both copies in the same function.
> 
> Now we're nitpicking nitpicks :)
> 

I look forward to a patch. Will one of you volunteer a patch?
Hans Holmberg Jan. 25, 2019, 4:46 p.m. UTC | #7
On Fri, Jan 25, 2019 at 3:35 PM Matias Bjørling <mb@lightnvm.io> wrote:
>
> On 1/25/19 3:21 PM, Hans Holmberg wrote:
> > On Fri, Jan 25, 2019 at 2:33 PM Javier González <javier@javigon.com> wrote:
> >>
> >>
> >>> On 25 Jan 2019, at 13.59, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
> >>>
> >>> On Fri, Jan 25, 2019 at 10:41 AM Javier González <javier@javigon.com> wrote:
> >>>>> On 25 Jan 2019, at 09.47, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
> >>>>>
> >>>>> On Thu, Jan 24, 2019 at 8:51 PM Zhoujie Wu <zjwu@marvell.com> wrote:
> >>>>>> The write pointer of the bad block could be 0 or undefined, ignore
> >>>>>> the checking of the bad block wp for pblk_line_wp_is_unbalanced to
> >>>>>> avoid fake warning.
> >>>>>
> >>>>> fake -> spurious?
> >>>>>
> >>>>>> Signed-off-by: Zhoujie Wu <zjwu@marvell.com>
> >>>>>> ---
> >>>>>> v3: return in case bit >= lm->blk_per_line.
> >>>>>> v2: changed according to Javier's comments.
> >>>>>>
> >>>>>> drivers/lightnvm/pblk-recovery.c | 12 +++++++++---
> >>>>>> 1 file changed, 9 insertions(+), 3 deletions(-)
> >>>>>>
> >>>>>> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
> >>>>>> index 6761d2a..02d466e 100644
> >>>>>> --- a/drivers/lightnvm/pblk-recovery.c
> >>>>>> +++ b/drivers/lightnvm/pblk-recovery.c
> >>>>>> @@ -312,21 +312,27 @@ static int pblk_line_wp_is_unbalanced(struct pblk *pblk,
> >>>>>>        struct nvm_chk_meta *chunk;
> >>>>>>        struct ppa_addr ppa;
> >>>>>>        u64 line_wp;
> >>>>>> -       int pos, i;
> >>>>>> +       int pos, i, bit;
> >>>>>
> >>>>> We don't need both bit and i, one of them is enough.
> >>>>>
> >>>>>> -       rlun = &pblk->luns[0];
> >>>>>> +       bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
> >>>>>> +       if (bit >= lm->blk_per_line)
> >>>>>> +               return 0;
> >>>>>
> >>>>> If there is only one non-offline chunk in the line, the wp can't be unbalanced,
> >>>>> so it should be safe to return 0 here if bit >= lm->blk_per_line - 1
> >>>>>
> >>>>> If you change this please document why using a comment, as it might
> >>>>> not be obvious
> >>>>>
> >>>>>> +       rlun = &pblk->luns[bit];
> >>>>>>        ppa = rlun->bppa;
> >>>>>>        pos = pblk_ppa_to_pos(geo, ppa);
> >>>>>>        chunk = &line->chks[pos];
> >>>>>
> >>>>>>        line_wp = chunk->wp;
> >>>>>>
> >>>>>> -       for (i = 1; i < lm->blk_per_line; i++) {
> >>>>>> +       for (i = bit + 1; i < lm->blk_per_line; i++) {
> >>>>>
> >>>>>>                rlun = &pblk->luns[i];
> >>>>>>                ppa = rlun->bppa;
> >>>>>>                pos = pblk_ppa_to_pos(geo, ppa);
> >>>>>>                chunk = &line->chks[pos];
> >>>>>
> >>>>> This code is a copy of the code above, it'd be nice to refactor it
> >>>>> into a helper function or just do the chunk lookups in one place.
> >>>>>
> >>>>>> +               if (chunk->state & NVM_CHK_ST_OFFLINE)
> >>>>>> +                       continue;
> >>>>>> +
> >>>>>
> >>>>> Since we rely on the block bitmap anyway, we might as well just
> >>>>> iterate over the zeroes in the block bitmap using find_next_zero_bit
> >>>>> instead.
> >>>>> We do this in lots of other places, see: git grep -n
> >>>>> find_next_zero_bit -- drivers/lightnvm
> >>>>
> >>>> Hans, I proposed him to use the chunk->state instead. I think it is way
> >>>> more robust. We introduced the block bitmap for OCSSD 1.2, because there
> >>>> was no state. Now that we have state, it is better to use it instead. In
> >>>> fact, we should remove the bock bitmap as we have to check for the state
> >>>> either way - note that this aligns also very well with you patches
> >>>> removing the other bitmaps.
> >>>
> >>> These are just nitpicks.
> >>>
> >>> Relying on two data structures(chunks, block bitmap) to be in sync in
> >>> this function in stead of one does not make it more robust imho.
> >>> Either or (checking chunks or the block bitmap) is fine by me.
> >>> Searching the bitmap is more efficient, so that is what I proposed.
> >>
> >> chunk log page is the ground truth, so it is more robust.
> >>
> >> Also, pblk has a long way to start seeing bitmap search vs. integer
> >> comparisons in profiling.
> >
> > Hehe, yeah, but it does not hurt to use the better alternative when available.
> >
> >>>
> >>> If we want to remove the block bitmap, we can do that as a separate patch(set)
> >>> I do agree, having two copies of the chunk state is something worth
> >>> getting rid of :)
> >>>
> >>
> >> A good start is not adding code using what we want to remove.
> >
> > Well, I think it's very confusing to use both copies in the same function.
> >
> > Now we're nitpicking nitpicks :)
> >
>
> I look forward to a patch. Will one of you volunteer a patch?

Sure! It'd be easier to Illustrate what I mean with a patch.
Javier González Jan. 25, 2019, 6:33 p.m. UTC | #8
> On 25 Jan 2019, at 17.46, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
> 
>> On Fri, Jan 25, 2019 at 3:35 PM Matias Bjørling <mb@lightnvm.io> wrote:
>> 
>>> On 1/25/19 3:21 PM, Hans Holmberg wrote:
>>>> On Fri, Jan 25, 2019 at 2:33 PM Javier González <javier@javigon.com> wrote:
>>>> 
>>>> 
>>>>> On 25 Jan 2019, at 13.59, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
>>>>> 
>>>>> On Fri, Jan 25, 2019 at 10:41 AM Javier González <javier@javigon.com> wrote:
>>>>>>> On 25 Jan 2019, at 09.47, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
>>>>>>> 
>>>>>>> On Thu, Jan 24, 2019 at 8:51 PM Zhoujie Wu <zjwu@marvell.com> wrote:
>>>>>>>> The write pointer of the bad block could be 0 or undefined, ignore
>>>>>>>> the checking of the bad block wp for pblk_line_wp_is_unbalanced to
>>>>>>>> avoid fake warning.
>>>>>>> 
>>>>>>> fake -> spurious?
>>>>>>> 
>>>>>>>> Signed-off-by: Zhoujie Wu <zjwu@marvell.com>
>>>>>>>> ---
>>>>>>>> v3: return in case bit >= lm->blk_per_line.
>>>>>>>> v2: changed according to Javier's comments.
>>>>>>>> 
>>>>>>>> drivers/lightnvm/pblk-recovery.c | 12 +++++++++---
>>>>>>>> 1 file changed, 9 insertions(+), 3 deletions(-)
>>>>>>>> 
>>>>>>>> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
>>>>>>>> index 6761d2a..02d466e 100644
>>>>>>>> --- a/drivers/lightnvm/pblk-recovery.c
>>>>>>>> +++ b/drivers/lightnvm/pblk-recovery.c
>>>>>>>> @@ -312,21 +312,27 @@ static int pblk_line_wp_is_unbalanced(struct pblk *pblk,
>>>>>>>>       struct nvm_chk_meta *chunk;
>>>>>>>>       struct ppa_addr ppa;
>>>>>>>>       u64 line_wp;
>>>>>>>> -       int pos, i;
>>>>>>>> +       int pos, i, bit;
>>>>>>> 
>>>>>>> We don't need both bit and i, one of them is enough.
>>>>>>> 
>>>>>>>> -       rlun = &pblk->luns[0];
>>>>>>>> +       bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
>>>>>>>> +       if (bit >= lm->blk_per_line)
>>>>>>>> +               return 0;
>>>>>>> 
>>>>>>> If there is only one non-offline chunk in the line, the wp can't be unbalanced,
>>>>>>> so it should be safe to return 0 here if bit >= lm->blk_per_line - 1
>>>>>>> 
>>>>>>> If you change this please document why using a comment, as it might
>>>>>>> not be obvious
>>>>>>> 
>>>>>>>> +       rlun = &pblk->luns[bit];
>>>>>>>>       ppa = rlun->bppa;
>>>>>>>>       pos = pblk_ppa_to_pos(geo, ppa);
>>>>>>>>       chunk = &line->chks[pos];
>>>>>>> 
>>>>>>>>       line_wp = chunk->wp;
>>>>>>>> 
>>>>>>>> -       for (i = 1; i < lm->blk_per_line; i++) {
>>>>>>>> +       for (i = bit + 1; i < lm->blk_per_line; i++) {
>>>>>>> 
>>>>>>>>               rlun = &pblk->luns[i];
>>>>>>>>               ppa = rlun->bppa;
>>>>>>>>               pos = pblk_ppa_to_pos(geo, ppa);
>>>>>>>>               chunk = &line->chks[pos];
>>>>>>> 
>>>>>>> This code is a copy of the code above, it'd be nice to refactor it
>>>>>>> into a helper function or just do the chunk lookups in one place.
>>>>>>> 
>>>>>>>> +               if (chunk->state & NVM_CHK_ST_OFFLINE)
>>>>>>>> +                       continue;
>>>>>>>> +
>>>>>>> 
>>>>>>> Since we rely on the block bitmap anyway, we might as well just
>>>>>>> iterate over the zeroes in the block bitmap using find_next_zero_bit
>>>>>>> instead.
>>>>>>> We do this in lots of other places, see: git grep -n
>>>>>>> find_next_zero_bit -- drivers/lightnvm
>>>>>> 
>>>>>> Hans, I proposed him to use the chunk->state instead. I think it is way
>>>>>> more robust. We introduced the block bitmap for OCSSD 1.2, because there
>>>>>> was no state. Now that we have state, it is better to use it instead. In
>>>>>> fact, we should remove the bock bitmap as we have to check for the state
>>>>>> either way - note that this aligns also very well with you patches
>>>>>> removing the other bitmaps.
>>>>> 
>>>>> These are just nitpicks.
>>>>> 
>>>>> Relying on two data structures(chunks, block bitmap) to be in sync in
>>>>> this function in stead of one does not make it more robust imho.
>>>>> Either or (checking chunks or the block bitmap) is fine by me.
>>>>> Searching the bitmap is more efficient, so that is what I proposed.
>>>> 
>>>> chunk log page is the ground truth, so it is more robust.
>>>> 
>>>> Also, pblk has a long way to start seeing bitmap search vs. integer
>>>> comparisons in profiling.
>>> 
>>> Hehe, yeah, but it does not hurt to use the better alternative when available.
>>> 
>>>>> 
>>>>> If we want to remove the block bitmap, we can do that as a separate patch(set)
>>>>> I do agree, having two copies of the chunk state is something worth
>>>>> getting rid of :)
>>>>> 
>>>> 
>>>> A good start is not adding code using what we want to remove.
>>> 
>>> Well, I think it's very confusing to use both copies in the same function.
>>> 
>>> Now we're nitpicking nitpicks :)
>>> 
>> 
>> I look forward to a patch. Will one of you volunteer a patch?
> 
> Sure! It'd be easier to Illustrate what I mean with a patch.

Cool. Go ahead - you will see how much cleaner it is using the chunk info. 

Matias : In any case Zhoujie’s fix is orthogonal to this discussion and I think you should pick it up - in one form of bb iteration or another - as it fixes a real issue. 

Javier.
Hans Holmberg Jan. 25, 2019, 8:20 p.m. UTC | #9
On Fri, Jan 25, 2019 at 7:33 PM Javier González <javier@javigon.com> wrote:
>
>
>
> > On 25 Jan 2019, at 17.46, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
> >
> >> On Fri, Jan 25, 2019 at 3:35 PM Matias Bjørling <mb@lightnvm.io> wrote:
> >>
> >>> On 1/25/19 3:21 PM, Hans Holmberg wrote:
> >>>> On Fri, Jan 25, 2019 at 2:33 PM Javier González <javier@javigon.com> wrote:
> >>>>
> >>>>
> >>>>> On 25 Jan 2019, at 13.59, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
> >>>>>
> >>>>> On Fri, Jan 25, 2019 at 10:41 AM Javier González <javier@javigon.com> wrote:
> >>>>>>> On 25 Jan 2019, at 09.47, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
> >>>>>>>
> >>>>>>> On Thu, Jan 24, 2019 at 8:51 PM Zhoujie Wu <zjwu@marvell.com> wrote:
> >>>>>>>> The write pointer of the bad block could be 0 or undefined, ignore
> >>>>>>>> the checking of the bad block wp for pblk_line_wp_is_unbalanced to
> >>>>>>>> avoid fake warning.
> >>>>>>>
> >>>>>>> fake -> spurious?
> >>>>>>>
> >>>>>>>> Signed-off-by: Zhoujie Wu <zjwu@marvell.com>
> >>>>>>>> ---
> >>>>>>>> v3: return in case bit >= lm->blk_per_line.
> >>>>>>>> v2: changed according to Javier's comments.
> >>>>>>>>
> >>>>>>>> drivers/lightnvm/pblk-recovery.c | 12 +++++++++---
> >>>>>>>> 1 file changed, 9 insertions(+), 3 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
> >>>>>>>> index 6761d2a..02d466e 100644
> >>>>>>>> --- a/drivers/lightnvm/pblk-recovery.c
> >>>>>>>> +++ b/drivers/lightnvm/pblk-recovery.c
> >>>>>>>> @@ -312,21 +312,27 @@ static int pblk_line_wp_is_unbalanced(struct pblk *pblk,
> >>>>>>>>       struct nvm_chk_meta *chunk;
> >>>>>>>>       struct ppa_addr ppa;
> >>>>>>>>       u64 line_wp;
> >>>>>>>> -       int pos, i;
> >>>>>>>> +       int pos, i, bit;
> >>>>>>>
> >>>>>>> We don't need both bit and i, one of them is enough.
> >>>>>>>
> >>>>>>>> -       rlun = &pblk->luns[0];
> >>>>>>>> +       bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
> >>>>>>>> +       if (bit >= lm->blk_per_line)
> >>>>>>>> +               return 0;
> >>>>>>>
> >>>>>>> If there is only one non-offline chunk in the line, the wp can't be unbalanced,
> >>>>>>> so it should be safe to return 0 here if bit >= lm->blk_per_line - 1
> >>>>>>>
> >>>>>>> If you change this please document why using a comment, as it might
> >>>>>>> not be obvious
> >>>>>>>
> >>>>>>>> +       rlun = &pblk->luns[bit];
> >>>>>>>>       ppa = rlun->bppa;
> >>>>>>>>       pos = pblk_ppa_to_pos(geo, ppa);
> >>>>>>>>       chunk = &line->chks[pos];
> >>>>>>>
> >>>>>>>>       line_wp = chunk->wp;
> >>>>>>>>
> >>>>>>>> -       for (i = 1; i < lm->blk_per_line; i++) {
> >>>>>>>> +       for (i = bit + 1; i < lm->blk_per_line; i++) {
> >>>>>>>
> >>>>>>>>               rlun = &pblk->luns[i];
> >>>>>>>>               ppa = rlun->bppa;
> >>>>>>>>               pos = pblk_ppa_to_pos(geo, ppa);
> >>>>>>>>               chunk = &line->chks[pos];
> >>>>>>>
> >>>>>>> This code is a copy of the code above, it'd be nice to refactor it
> >>>>>>> into a helper function or just do the chunk lookups in one place.
> >>>>>>>
> >>>>>>>> +               if (chunk->state & NVM_CHK_ST_OFFLINE)
> >>>>>>>> +                       continue;
> >>>>>>>> +
> >>>>>>>
> >>>>>>> Since we rely on the block bitmap anyway, we might as well just
> >>>>>>> iterate over the zeroes in the block bitmap using find_next_zero_bit
> >>>>>>> instead.
> >>>>>>> We do this in lots of other places, see: git grep -n
> >>>>>>> find_next_zero_bit -- drivers/lightnvm
> >>>>>>
> >>>>>> Hans, I proposed him to use the chunk->state instead. I think it is way
> >>>>>> more robust. We introduced the block bitmap for OCSSD 1.2, because there
> >>>>>> was no state. Now that we have state, it is better to use it instead. In
> >>>>>> fact, we should remove the bock bitmap as we have to check for the state
> >>>>>> either way - note that this aligns also very well with you patches
> >>>>>> removing the other bitmaps.
> >>>>>
> >>>>> These are just nitpicks.
> >>>>>
> >>>>> Relying on two data structures(chunks, block bitmap) to be in sync in
> >>>>> this function in stead of one does not make it more robust imho.
> >>>>> Either or (checking chunks or the block bitmap) is fine by me.
> >>>>> Searching the bitmap is more efficient, so that is what I proposed.
> >>>>
> >>>> chunk log page is the ground truth, so it is more robust.
> >>>>
> >>>> Also, pblk has a long way to start seeing bitmap search vs. integer
> >>>> comparisons in profiling.
> >>>
> >>> Hehe, yeah, but it does not hurt to use the better alternative when available.
> >>>
> >>>>>
> >>>>> If we want to remove the block bitmap, we can do that as a separate patch(set)
> >>>>> I do agree, having two copies of the chunk state is something worth
> >>>>> getting rid of :)
> >>>>>
> >>>>
> >>>> A good start is not adding code using what we want to remove.
> >>>
> >>> Well, I think it's very confusing to use both copies in the same function.
> >>>
> >>> Now we're nitpicking nitpicks :)
> >>>
> >>
> >> I look forward to a patch. Will one of you volunteer a patch?
> >
> > Sure! It'd be easier to Illustrate what I mean with a patch.
>
> Cool. Go ahead - you will see how much cleaner it is using the chunk info.
>
> Matias : In any case Zhoujie’s fix is orthogonal to this discussion and I think you should pick it up - in one form of bb iteration or another - as it fixes a real issue.

Second that! I feel bad about the fix not making it in just because of
this squabbling.

> Javier.
Javier González Jan. 25, 2019, 9:30 p.m. UTC | #10
> On 25 Jan 2019, at 21.20, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
> 
>> On Fri, Jan 25, 2019 at 7:33 PM Javier González <javier@javigon.com> wrote:
>> 
>> 
>> 
>>>> On 25 Jan 2019, at 17.46, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
>>>> 
>>>>> On Fri, Jan 25, 2019 at 3:35 PM Matias Bjørling <mb@lightnvm.io> wrote:
>>>>> 
>>>>>> On 1/25/19 3:21 PM, Hans Holmberg wrote:
>>>>>> On Fri, Jan 25, 2019 at 2:33 PM Javier González <javier@javigon.com> wrote:
>>>>>> 
>>>>>> 
>>>>>>> On 25 Jan 2019, at 13.59, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
>>>>>>> 
>>>>>>> On Fri, Jan 25, 2019 at 10:41 AM Javier González <javier@javigon.com> wrote:
>>>>>>>>> On 25 Jan 2019, at 09.47, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
>>>>>>>>> 
>>>>>>>>> On Thu, Jan 24, 2019 at 8:51 PM Zhoujie Wu <zjwu@marvell.com> wrote:
>>>>>>>>>> The write pointer of the bad block could be 0 or undefined, ignore
>>>>>>>>>> the checking of the bad block wp for pblk_line_wp_is_unbalanced to
>>>>>>>>>> avoid fake warning.
>>>>>>>>> 
>>>>>>>>> fake -> spurious?
>>>>>>>>> 
>>>>>>>>>> Signed-off-by: Zhoujie Wu <zjwu@marvell.com>
>>>>>>>>>> ---
>>>>>>>>>> v3: return in case bit >= lm->blk_per_line.
>>>>>>>>>> v2: changed according to Javier's comments.
>>>>>>>>>> 
>>>>>>>>>> drivers/lightnvm/pblk-recovery.c | 12 +++++++++---
>>>>>>>>>> 1 file changed, 9 insertions(+), 3 deletions(-)
>>>>>>>>>> 
>>>>>>>>>> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
>>>>>>>>>> index 6761d2a..02d466e 100644
>>>>>>>>>> --- a/drivers/lightnvm/pblk-recovery.c
>>>>>>>>>> +++ b/drivers/lightnvm/pblk-recovery.c
>>>>>>>>>> @@ -312,21 +312,27 @@ static int pblk_line_wp_is_unbalanced(struct pblk *pblk,
>>>>>>>>>>      struct nvm_chk_meta *chunk;
>>>>>>>>>>      struct ppa_addr ppa;
>>>>>>>>>>      u64 line_wp;
>>>>>>>>>> -       int pos, i;
>>>>>>>>>> +       int pos, i, bit;
>>>>>>>>> 
>>>>>>>>> We don't need both bit and i, one of them is enough.
>>>>>>>>> 
>>>>>>>>>> -       rlun = &pblk->luns[0];
>>>>>>>>>> +       bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
>>>>>>>>>> +       if (bit >= lm->blk_per_line)
>>>>>>>>>> +               return 0;
>>>>>>>>> 
>>>>>>>>> If there is only one non-offline chunk in the line, the wp can't be unbalanced,
>>>>>>>>> so it should be safe to return 0 here if bit >= lm->blk_per_line - 1
>>>>>>>>> 
>>>>>>>>> If you change this please document why using a comment, as it might
>>>>>>>>> not be obvious
>>>>>>>>> 
>>>>>>>>>> +       rlun = &pblk->luns[bit];
>>>>>>>>>>      ppa = rlun->bppa;
>>>>>>>>>>      pos = pblk_ppa_to_pos(geo, ppa);
>>>>>>>>>>      chunk = &line->chks[pos];
>>>>>>>>> 
>>>>>>>>>>      line_wp = chunk->wp;
>>>>>>>>>> 
>>>>>>>>>> -       for (i = 1; i < lm->blk_per_line; i++) {
>>>>>>>>>> +       for (i = bit + 1; i < lm->blk_per_line; i++) {
>>>>>>>>> 
>>>>>>>>>>              rlun = &pblk->luns[i];
>>>>>>>>>>              ppa = rlun->bppa;
>>>>>>>>>>              pos = pblk_ppa_to_pos(geo, ppa);
>>>>>>>>>>              chunk = &line->chks[pos];
>>>>>>>>> 
>>>>>>>>> This code is a copy of the code above, it'd be nice to refactor it
>>>>>>>>> into a helper function or just do the chunk lookups in one place.
>>>>>>>>> 
>>>>>>>>>> +               if (chunk->state & NVM_CHK_ST_OFFLINE)
>>>>>>>>>> +                       continue;
>>>>>>>>>> +
>>>>>>>>> 
>>>>>>>>> Since we rely on the block bitmap anyway, we might as well just
>>>>>>>>> iterate over the zeroes in the block bitmap using find_next_zero_bit
>>>>>>>>> instead.
>>>>>>>>> We do this in lots of other places, see: git grep -n
>>>>>>>>> find_next_zero_bit -- drivers/lightnvm
>>>>>>>> 
>>>>>>>> Hans, I proposed him to use the chunk->state instead. I think it is way
>>>>>>>> more robust. We introduced the block bitmap for OCSSD 1.2, because there
>>>>>>>> was no state. Now that we have state, it is better to use it instead. In
>>>>>>>> fact, we should remove the bock bitmap as we have to check for the state
>>>>>>>> either way - note that this aligns also very well with you patches
>>>>>>>> removing the other bitmaps.
>>>>>>> 
>>>>>>> These are just nitpicks.
>>>>>>> 
>>>>>>> Relying on two data structures(chunks, block bitmap) to be in sync in
>>>>>>> this function in stead of one does not make it more robust imho.
>>>>>>> Either or (checking chunks or the block bitmap) is fine by me.
>>>>>>> Searching the bitmap is more efficient, so that is what I proposed.
>>>>>> 
>>>>>> chunk log page is the ground truth, so it is more robust.
>>>>>> 
>>>>>> Also, pblk has a long way to start seeing bitmap search vs. integer
>>>>>> comparisons in profiling.
>>>>> 
>>>>> Hehe, yeah, but it does not hurt to use the better alternative when available.
>>>>> 
>>>>>>> 
>>>>>>> If we want to remove the block bitmap, we can do that as a separate patch(set)
>>>>>>> I do agree, having two copies of the chunk state is something worth
>>>>>>> getting rid of :)
>>>>>>> 
>>>>>> 
>>>>>> A good start is not adding code using what we want to remove.
>>>>> 
>>>>> Well, I think it's very confusing to use both copies in the same function.
>>>>> 
>>>>> Now we're nitpicking nitpicks :)
>>>>> 
>>>> 
>>>> I look forward to a patch. Will one of you volunteer a patch?
>>> 
>>> Sure! It'd be easier to Illustrate what I mean with a patch.
>> 
>> Cool. Go ahead - you will see how much cleaner it is using the chunk info.
>> 
>> Matias : In any case Zhoujie’s fix is orthogonal to this discussion and I think you should pick it up - in one form of bb iteration or another - as it fixes a real issue.
> 
> Second that! I feel bad about the fix not making it in just because of
> this squabbling.

No squabbling :) We both care about pblk and want to get the code right. 

If you give that first pass to that patch, it would be great. We can take it from there. 

Nos enjoy the weekend!

Javier.
Hans Holmberg Jan. 28, 2019, 1:57 p.m. UTC | #11
Hi again,

while refactoring the code I realized that there is another case we
are not considering:
if the max and min write pointers across all non-bad chunks are off by
more than the write unit, we will not be able to continue writing to
the line.

I'm adding a check of this in my patch.

All the best,
Hans


On Fri, Jan 25, 2019 at 10:30 PM Javier González <javier@javigon.com> wrote:
>
>
> > On 25 Jan 2019, at 21.20, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
> >
> >> On Fri, Jan 25, 2019 at 7:33 PM Javier González <javier@javigon.com> wrote:
> >>
> >>
> >>
> >>>> On 25 Jan 2019, at 17.46, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
> >>>>
> >>>>> On Fri, Jan 25, 2019 at 3:35 PM Matias Bjørling <mb@lightnvm.io> wrote:
> >>>>>
> >>>>>> On 1/25/19 3:21 PM, Hans Holmberg wrote:
> >>>>>> On Fri, Jan 25, 2019 at 2:33 PM Javier González <javier@javigon.com> wrote:
> >>>>>>
> >>>>>>
> >>>>>>> On 25 Jan 2019, at 13.59, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
> >>>>>>>
> >>>>>>> On Fri, Jan 25, 2019 at 10:41 AM Javier González <javier@javigon.com> wrote:
> >>>>>>>>> On 25 Jan 2019, at 09.47, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
> >>>>>>>>>
> >>>>>>>>> On Thu, Jan 24, 2019 at 8:51 PM Zhoujie Wu <zjwu@marvell.com> wrote:
> >>>>>>>>>> The write pointer of the bad block could be 0 or undefined, ignore
> >>>>>>>>>> the checking of the bad block wp for pblk_line_wp_is_unbalanced to
> >>>>>>>>>> avoid fake warning.
> >>>>>>>>>
> >>>>>>>>> fake -> spurious?
> >>>>>>>>>
> >>>>>>>>>> Signed-off-by: Zhoujie Wu <zjwu@marvell.com>
> >>>>>>>>>> ---
> >>>>>>>>>> v3: return in case bit >= lm->blk_per_line.
> >>>>>>>>>> v2: changed according to Javier's comments.
> >>>>>>>>>>
> >>>>>>>>>> drivers/lightnvm/pblk-recovery.c | 12 +++++++++---
> >>>>>>>>>> 1 file changed, 9 insertions(+), 3 deletions(-)
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
> >>>>>>>>>> index 6761d2a..02d466e 100644
> >>>>>>>>>> --- a/drivers/lightnvm/pblk-recovery.c
> >>>>>>>>>> +++ b/drivers/lightnvm/pblk-recovery.c
> >>>>>>>>>> @@ -312,21 +312,27 @@ static int pblk_line_wp_is_unbalanced(struct pblk *pblk,
> >>>>>>>>>>      struct nvm_chk_meta *chunk;
> >>>>>>>>>>      struct ppa_addr ppa;
> >>>>>>>>>>      u64 line_wp;
> >>>>>>>>>> -       int pos, i;
> >>>>>>>>>> +       int pos, i, bit;
> >>>>>>>>>
> >>>>>>>>> We don't need both bit and i, one of them is enough.
> >>>>>>>>>
> >>>>>>>>>> -       rlun = &pblk->luns[0];
> >>>>>>>>>> +       bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
> >>>>>>>>>> +       if (bit >= lm->blk_per_line)
> >>>>>>>>>> +               return 0;
> >>>>>>>>>
> >>>>>>>>> If there is only one non-offline chunk in the line, the wp can't be unbalanced,
> >>>>>>>>> so it should be safe to return 0 here if bit >= lm->blk_per_line - 1
> >>>>>>>>>
> >>>>>>>>> If you change this please document why using a comment, as it might
> >>>>>>>>> not be obvious
> >>>>>>>>>
> >>>>>>>>>> +       rlun = &pblk->luns[bit];
> >>>>>>>>>>      ppa = rlun->bppa;
> >>>>>>>>>>      pos = pblk_ppa_to_pos(geo, ppa);
> >>>>>>>>>>      chunk = &line->chks[pos];
> >>>>>>>>>
> >>>>>>>>>>      line_wp = chunk->wp;
> >>>>>>>>>>
> >>>>>>>>>> -       for (i = 1; i < lm->blk_per_line; i++) {
> >>>>>>>>>> +       for (i = bit + 1; i < lm->blk_per_line; i++) {
> >>>>>>>>>
> >>>>>>>>>>              rlun = &pblk->luns[i];
> >>>>>>>>>>              ppa = rlun->bppa;
> >>>>>>>>>>              pos = pblk_ppa_to_pos(geo, ppa);
> >>>>>>>>>>              chunk = &line->chks[pos];
> >>>>>>>>>
> >>>>>>>>> This code is a copy of the code above, it'd be nice to refactor it
> >>>>>>>>> into a helper function or just do the chunk lookups in one place.
> >>>>>>>>>
> >>>>>>>>>> +               if (chunk->state & NVM_CHK_ST_OFFLINE)
> >>>>>>>>>> +                       continue;
> >>>>>>>>>> +
> >>>>>>>>>
> >>>>>>>>> Since we rely on the block bitmap anyway, we might as well just
> >>>>>>>>> iterate over the zeroes in the block bitmap using find_next_zero_bit
> >>>>>>>>> instead.
> >>>>>>>>> We do this in lots of other places, see: git grep -n
> >>>>>>>>> find_next_zero_bit -- drivers/lightnvm
> >>>>>>>>
> >>>>>>>> Hans, I proposed him to use the chunk->state instead. I think it is way
> >>>>>>>> more robust. We introduced the block bitmap for OCSSD 1.2, because there
> >>>>>>>> was no state. Now that we have state, it is better to use it instead. In
> >>>>>>>> fact, we should remove the bock bitmap as we have to check for the state
> >>>>>>>> either way - note that this aligns also very well with you patches
> >>>>>>>> removing the other bitmaps.
> >>>>>>>
> >>>>>>> These are just nitpicks.
> >>>>>>>
> >>>>>>> Relying on two data structures(chunks, block bitmap) to be in sync in
> >>>>>>> this function in stead of one does not make it more robust imho.
> >>>>>>> Either or (checking chunks or the block bitmap) is fine by me.
> >>>>>>> Searching the bitmap is more efficient, so that is what I proposed.
> >>>>>>
> >>>>>> chunk log page is the ground truth, so it is more robust.
> >>>>>>
> >>>>>> Also, pblk has a long way to start seeing bitmap search vs. integer
> >>>>>> comparisons in profiling.
> >>>>>
> >>>>> Hehe, yeah, but it does not hurt to use the better alternative when available.
> >>>>>
> >>>>>>>
> >>>>>>> If we want to remove the block bitmap, we can do that as a separate patch(set)
> >>>>>>> I do agree, having two copies of the chunk state is something worth
> >>>>>>> getting rid of :)
> >>>>>>>
> >>>>>>
> >>>>>> A good start is not adding code using what we want to remove.
> >>>>>
> >>>>> Well, I think it's very confusing to use both copies in the same function.
> >>>>>
> >>>>> Now we're nitpicking nitpicks :)
> >>>>>
> >>>>
> >>>> I look forward to a patch. Will one of you volunteer a patch?
> >>>
> >>> Sure! It'd be easier to Illustrate what I mean with a patch.
> >>
> >> Cool. Go ahead - you will see how much cleaner it is using the chunk info.
> >>
> >> Matias : In any case Zhoujie’s fix is orthogonal to this discussion and I think you should pick it up - in one form of bb iteration or another - as it fixes a real issue.
> >
> > Second that! I feel bad about the fix not making it in just because of
> > this squabbling.
>
> No squabbling :) We both care about pblk and want to get the code right.
>
> If you give that first pass to that patch, it would be great. We can take it from there.
>
> Nos enjoy the weekend!
>
> Javier.
diff mbox series

Patch

diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
index 6761d2a..02d466e 100644
--- a/drivers/lightnvm/pblk-recovery.c
+++ b/drivers/lightnvm/pblk-recovery.c
@@ -312,21 +312,27 @@  static int pblk_line_wp_is_unbalanced(struct pblk *pblk,
 	struct nvm_chk_meta *chunk;
 	struct ppa_addr ppa;
 	u64 line_wp;
-	int pos, i;
+	int pos, i, bit;
 
-	rlun = &pblk->luns[0];
+	bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
+	if (bit >= lm->blk_per_line)
+		return 0;
+	rlun = &pblk->luns[bit];
 	ppa = rlun->bppa;
 	pos = pblk_ppa_to_pos(geo, ppa);
 	chunk = &line->chks[pos];
 
 	line_wp = chunk->wp;
 
-	for (i = 1; i < lm->blk_per_line; i++) {
+	for (i = bit + 1; i < lm->blk_per_line; i++) {
 		rlun = &pblk->luns[i];
 		ppa = rlun->bppa;
 		pos = pblk_ppa_to_pos(geo, ppa);
 		chunk = &line->chks[pos];
 
+		if (chunk->state & NVM_CHK_ST_OFFLINE)
+			continue;
+
 		if (chunk->wp > line_wp)
 			return 1;
 		else if (chunk->wp < line_wp)