diff mbox

[1/5] do not corrupt ptrlist while killing unreachable BBs

Message ID CANeU7QnVNE38ODJ+cgUzD3Vz4XKogk-RL2v6-dS2aX9Oc4_0uQ@mail.gmail.com (mailing list archive)
State Rejected, archived
Headers show

Commit Message

Christopher Li July 7, 2017, 1:18 a.m. UTC
On Thu, Jul 6, 2017 at 5:40 PM, Christopher Li <sparse@chrisli.org> wrote:
>
> Thanks for catching it. That is a very serious bug.
> The release will be on hold until this bug has been fix.
> It is actually more than one of them. I will send out a separate patch
> catching this kind of behavior.
>

Here is the patch. There is another offender in simplify.c

Chris

From: Christopher Li <sparse@chrisli.org>
Date: Thu, 6 Jul 2017 16:25:38 -0700
Subject: [PATCH 2/2] check on delete list entry while parent caller using it.

This is checking for type the bug Luc reported.
Basiclly, the parent DO_FOR_EACH() has __nr caching the
current ptr position. When the inner loop function delete
one entry before the current position. The parent current
position needs to be adjust, because the entry[] has been
move forward.

This patch only check usage FOR_EACH_XXX macro. There is
also PREPARE_PTR_LIST haven't cover by this patch.
It is already catching bugs left and right.

Most noticablely remove_usage() inside of the  kill_use_list()
loop.
---
 ptrlist.h | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

  * Hey, who said that you can't do overloading in C?
@@ -158,6 +160,7 @@ static inline void *last_ptr_list(struct ptr_list *list)
  CHECK_TYPE(head,ptr); \
  if (__head) { \
  do { int __nr; \
+ __list->active++; \
  for (__nr = 0; __nr < __list->nr; __nr++) { \
  do { \
  ptr = PTR_ENTRY(__list,__nr); \
@@ -167,6 +170,7 @@ static inline void *last_ptr_list(struct ptr_list *list)
  } while (0); \
  } while (0); \
  } \
+ __list->active--; \
  } while ((__list = __list->next) != __head); \
  } \
 } while (0)
@@ -179,6 +183,7 @@ static inline void *last_ptr_list(struct ptr_list *list)
  do { int __nr; \
  __list = __list->prev; \
  __nr = __list->nr; \
+ __list->active++; \
  while (--__nr >= 0) { \
  do { \
  ptr = PTR_ENTRY(__list,__nr); \
@@ -189,6 +194,7 @@ static inline void *last_ptr_list(struct ptr_list *list)
  } while (0); \
  } while (0); \
  } \
+ __list->active--; \
  } while (__list != __head); \
  } \
 } while (0)
@@ -270,6 +276,9 @@ extern void split_ptr_list_head(struct ptr_list *);
  DO_INSERT_CURRENT(new, ptr, __head##ptr, __list##ptr, __nr##ptr)

 #define DO_DELETE_CURRENT(ptr, __head, __list, __nr) do { \
+ if (__list->active > 1) \
+ die("%s:%d delete entry with %d parent using ", \
+       __FILE__, __LINE__, __list->active - 1); \
  void **__this = __list->list + __nr; \
  void **__last = __list->list + __list->nr - 1; \
  while (__this < __last) { \

Comments

Linus Torvalds July 7, 2017, 1:35 a.m. UTC | #1
On Thu, Jul 6, 2017 at 6:18 PM, Christopher Li <sparse@chrisli.org> wrote:
>
> Basiclly, the parent DO_FOR_EACH() has __nr caching the
> current ptr position. When the inner loop function delete
> one entry before the current position. The parent current
> position needs to be adjust, because the entry[] has been
> move forward.

Gaah.

I think the original idea for this was to call pack_ptr_list() only at
the end of all the list traversal that could delete things.

                 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christopher Li July 7, 2017, 5:15 a.m. UTC | #2
On Thu, Jul 6, 2017 at 6:35 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> Gaah.
>
> I think the original idea for this was to call pack_ptr_list() only at
> the end of all the list traversal that could delete things.
>

That is a very good suggestion. The problem would be to find out who
is the last one using that list. Also walking the list to find out the empty
ptr is extra overhead.

I have an idea. We can actually reference count the ptr_list usage like the
above patch. We need to because we want to know who is the lats one using
that ptrlist bucket. The last one using the ptr_list bucket (not the whole list)
can pack this bucket. It will avoid walking the list more than once just to do
the packing. I will try this idea soon. It might work.

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luc Van Oostenryck July 7, 2017, 5:44 a.m. UTC | #3
On Thu, Jul 06, 2017 at 06:18:48PM -0700, Christopher Li wrote:
> On Thu, Jul 6, 2017 at 5:40 PM, Christopher Li <sparse@chrisli.org> wrote:
> >
> > Thanks for catching it. That is a very serious bug.
> > The release will be on hold until this bug has been fix.
> > It is actually more than one of them. I will send out a separate patch
> > catching this kind of behavior.
> >
> 
> Here is the patch. There is another offender in simplify.c
> 
> Chris
> 
> From: Christopher Li <sparse@chrisli.org>
> Date: Thu, 6 Jul 2017 16:25:38 -0700
> Subject: [PATCH 2/2] check on delete list entry while parent caller using it.
> 
> This is checking for type the bug Luc reported.
> Basiclly, the parent DO_FOR_EACH() has __nr caching the
> current ptr position. When the inner loop function delete
> one entry before the current position. The parent current
> position needs to be adjust, because the entry[] has been
> move forward.
> 
> This patch only check usage FOR_EACH_XXX macro. There is
> also PREPARE_PTR_LIST haven't cover by this patch.
> It is already catching bugs left and right.
> 
> Most noticablely remove_usage() inside of the  kill_use_list()
> loop.
> ---
>  ptrlist.h | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/ptrlist.h b/ptrlist.h
> index d09be2f..4f38a4e 100644
> --- a/ptrlist.h
> +++ b/ptrlist.h
> @@ -25,7 +25,8 @@
>  #define LIST_NODE_NR (29)
> 
>  struct ptr_list {
> - int nr;
> + unsigned int nr:16;
> + unsigned int active:16;

Mmmm ...

I really don't like this kind of solution.
The ptrlist is already fragile and now it would contains
yet another field that need to be taken care of *and* stay
coherent with other users. I doubt this will help to reduce bugs.

Also, I don't think the principle of this solution is sound.
In particular, these die() make me nervous. 
Can you explain a bit how it is working, how you can guarantee
that these die() won't be called?

-- Luc
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christopher Li July 7, 2017, 6:02 a.m. UTC | #4
On Thu, Jul 6, 2017 at 10:44 PM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> Mmmm ...
>
> I really don't like this kind of solution.
> The ptrlist is already fragile and now it would contains
> yet another field that need to be taken care of *and* stay
> coherent with other users. I doubt this will help to reduce bugs.

You totally miss the point. This patch is not a final solution.
I am not suggesting submit this patch as it is.

It is mean to expose existing bugs. The reason ptrlist is fragile
is we did not do it properly. Not safe against recursive ptr
loop with inner loop delete stuff from the same list.

> Also, I don't think the principle of this solution is sound.
> In particular, these die() make me nervous.
> Can you explain a bit how it is working, how you can guarantee
> that these die() won't be called?

It wouldn't, every die() means a bug get discovered. We need to fix that.

Go try this patch with a "make check". It has over one hundreds fails.
Those are real bugs. The current big offender is remove_usage() inside
of the  kill_use_list(). It might relate to your code as well. Can you help me
take a look at the offender?

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luc Van Oostenryck July 7, 2017, 6:04 a.m. UTC | #5
On Thu, Jul 06, 2017 at 06:18:48PM -0700, Christopher Li wrote:
> On Thu, Jul 6, 2017 at 5:40 PM, Christopher Li <sparse@chrisli.org> wrote:
> Most noticablely remove_usage() inside of the  kill_use_list()
> loop.

Can you explain a bit what's wrong with this one?

The call chain looks basically like:
   simplify_instruction()			loop over ep->bb & bb->insns)
      kill_use_list()				loop over insn->phi_list
         kill_use() -> remove_use()
            delete_pseudo_user_list_entry	loop over pseudo->users
            kill_instruction()
               kill_use_list()


Note: kill_instruction() doesn't delete the instruction from
      the BB but just mark them as removed which is, I think,
      the right pattern to adopt in general instead of the
      delete/pack.

-- Luc
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christopher Li July 7, 2017, 6:07 a.m. UTC | #6
On Thu, Jul 6, 2017 at 6:35 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>>
>> Basiclly, the parent DO_FOR_EACH() has __nr caching the
>> current ptr position. When the inner loop function delete
>> one entry before the current position. The parent current
>> position needs to be adjust, because the entry[] has been
>> move forward.
>
> Gaah.
>
> I think the original idea for this was to call pack_ptr_list() only at
> the end of all the list traversal that could delete things.

Actually,  the point I am making is different than pack_ptr_list().
I just take a closer look. Pack_ptr_list() only remove empty ptr nodes.
It does not take care my realization that entry[] array sliding forward
causing the parent for loop *missing* ptr entry. Basically the parent
for loop will skip some ptr entry. It is s very subtle and hard to discover.

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luc Van Oostenryck July 7, 2017, 6:10 a.m. UTC | #7
On Thu, Jul 06, 2017 at 11:02:24PM -0700, Christopher Li wrote:
> On Thu, Jul 6, 2017 at 10:44 PM, Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> > Mmmm ...
> >
> > I really don't like this kind of solution.
> > The ptrlist is already fragile and now it would contains
> > yet another field that need to be taken care of *and* stay
> > coherent with other users. I doubt this will help to reduce bugs.
> 
> You totally miss the point. This patch is not a final solution.
> I am not suggesting submit this patch as it is.
> 
> It is mean to expose existing bugs.

Ah, OK.

> Go try this patch with a "make check". It has over one hundreds fails.

I'll do, later (just wakeup here).

> Those are real bugs. The current big offender is remove_usage() inside
> of the  kill_use_list(). It might relate to your code as well. Can you help me
> take a look at the offender?

Yes, of course I'll help with this.
And yes, I added a bunch of these remove_use() with the recursive call
to kill_instruction() which I never liked much but solved a bunch of
things.
 
-- Luc
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christopher Li July 7, 2017, 6:18 a.m. UTC | #8
On Thu, Jul 6, 2017 at 11:04 PM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> On Thu, Jul 06, 2017 at 06:18:48PM -0700, Christopher Li wrote:
>> On Thu, Jul 6, 2017 at 5:40 PM, Christopher Li <sparse@chrisli.org> wrote:
>> Most noticablely remove_usage() inside of the  kill_use_list()
>> loop.
>
> Can you explain a bit what's wrong with this one?

Sure. Sorry I haven't be more specific.
The offending list in question is not the instruction list. It is the
pesudo->user list.

kill_use_list is iterate though p->user.
FOR_EACH_PTR(list, p) {
if (p == VOID)
continue;
kill_use(THIS_ADDRESS(p));
} END_FOR_EACH_PTR(p)


And remove_usage() is deleting the very same list
from with in the loop. That is the bug.

Basically, deleting entry from inner loop while outer loop
is going over the same list has the deleted entry[] sliding
forward problem. Cause the outer loop possible to skip some
entries.

It likely a different bug than the one you discover.
Your crash is likely cause by pack_ptr_list inside the
ptrlist loop. Which cause some pointer point to deleted
node.


I set a break point at die and get the back track as follows:
#0  die (fmt=fmt@entry=0x43e160 "%s:%d delete entry with %d parent
using ") at lib.c:204
#1  0x0000000000420a0f in delete_pseudo_user_list_entry
(list=list@entry=0x7ffff7f2e158, entry=entry@entry=0x7ffff7f72c28,
count=1)
    at simplify.c:175
#2  0x0000000000420d16 in remove_usage (usep=0x7ffff7f72c28,
p=0x7ffff7f2e150) at simplify.c:189
#3  kill_use (usep=0x7ffff7f72c28) at simplify.c:200
#4  kill_use_list (list=0x7ffff7f72c10) at simplify.c:210
#5  0x0000000000420bc9 in kill_insn (insn=insn@entry=0x7ffff7f36450,
force=force@entry=0) at simplify.c:249
#6  0x0000000000422244 in kill_instruction (insn=0x7ffff7f36450) at flow.h:32
#7  clean_up_phi (insn=0x7ffff7f36450) at simplify.c:162
#8  simplify_instruction (insn=insn@entry=0x7ffff7f36450) at simplify.c:1202
#9  0x000000000042015b in clean_up_one_instruction
(insn=0x7ffff7f36450, bb=0x7ffff7f3e5b0) at cse.c:45
#10 clean_up_insns (ep=0x7ffff7f56010) at cse.c:135
#11 cleanup_and_cse (ep=ep@entry=0x7ffff7f56010) at cse.c:366
#12 0x00000000004182e0 in linearize_fn (base_type=<optimized out>,
sym=0x7ffff7f56010) at linearize.c:2244
#13 linearize_symbol (sym=sym@entry=0x7ffff7f69490) at linearize.c:2286
#14 0x0000000000401139 in clean_up_symbols (list=0x7ffff7f6f510) at
test-linearize.c:49
#15 main (argc=<optimized out>, argv=<optimized out>) at test-linearize.c:62

Hope that helps.

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christopher Li July 7, 2017, 6:27 a.m. UTC | #9
On Thu, Jul 6, 2017 at 11:10 PM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
>> Go try this patch with a "make check". It has over one hundreds fails.
>
> I'll do, later (just wakeup here).

No problem at all.

Just a head up. I need to crash soon. And tomorrow I will not have much
Internet access at all. I have time to code but no internet. I will see if I
can get a ptrlist safe against delete.

>
>> Those are real bugs. The current big offender is remove_usage() inside
>> of the  kill_use_list(). It might relate to your code as well. Can you help me
>> take a look at the offender?
>
> Yes, of course I'll help with this.
> And yes, I added a bunch of these remove_use() with the recursive call
> to kill_instruction() which I never liked much but solved a bunch of
> things.

Thanks for the help. Yes, that is exactly the bug I am talking about.
Deleting list entry while the parent is still looping on it.
Sparse ptrlist currently has very subtle bug on this situation.

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luc Van Oostenryck July 7, 2017, 7:11 a.m. UTC | #10
On Thu, Jul 06, 2017 at 11:18:39PM -0700, Christopher Li wrote:
> On Thu, Jul 6, 2017 at 11:04 PM, Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> > On Thu, Jul 06, 2017 at 06:18:48PM -0700, Christopher Li wrote:
> >> On Thu, Jul 6, 2017 at 5:40 PM, Christopher Li <sparse@chrisli.org> wrote:
> >> Most noticablely remove_usage() inside of the  kill_use_list()
> >> loop.
> >
> > Can you explain a bit what's wrong with this one?
> 
> Sure. Sorry I haven't be more specific.
> The offending list in question is not the instruction list. It is the
> pesudo->user list.
> 
> kill_use_list is iterate though p->user.
> FOR_EACH_PTR(list, p) {
> if (p == VOID)
> continue;
> kill_use(THIS_ADDRESS(p));
> } END_FOR_EACH_PTR(p)
> 
> 
> And remove_usage() is deleting the very same list
> from with in the loop. That is the bug.

Strange. kill_use_list() is only iterated via insn->phi_list
or insn->arguments, not p->user.

But yes, something is surely messing with the lists here.

> It likely a different bug than the one you discover.
> Your crash is likely cause by pack_ptr_list inside the
> ptrlist loop. Which cause some pointer point to deleted
> node.

Yes, it's a different one.
Mine wasn't through pack_ptr_list() but directly in the macros
doing the list walking of ep->bbs.

Anyway, look at all this later.

-- Luc
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luc Van Oostenryck July 7, 2017, 7:30 a.m. UTC | #11
On Thu, Jul 06, 2017 at 11:27:58PM -0700, Christopher Li wrote:
> On Thu, Jul 6, 2017 at 11:10 PM, Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> >
> >> Go try this patch with a "make check". It has over one hundreds fails.
> >
> > I'll do, later (just wakeup here).
> 
> No problem at all.
> 
> Just a head up. I need to crash soon. And tomorrow I will not have much
> Internet access at all. I have time to code but no internet. I will see if I
> can get a ptrlist safe against delete.

Sure, no problem.
 
> Sparse ptrlist currently has very subtle bug on this situation.

Yes, it's not the first time we have been bitten by them.

The real problem, in my opinion, is:
- The ptr_list macros & functions have rules about their usage but
  * they are not explained at all
  * they is no way to enforce them or detect a violation
  * detection a violation will probably have a cost higher than
    we'll be willing to pay.
  * it's hard to not violate these rules because list walking is
    ubiquitous.
- Problems can only occurs when deleting an element from the list.
  Pure walking of the list, adding an element or modifying one
  is safe. Even modifying the pointer itself, like, for example,
  setting it to NULL, is safe (hint, hint!).

-- Luc
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christopher Li July 7, 2017, 8:25 a.m. UTC | #12
On Fri, Jul 7, 2017 at 12:11 AM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>> And remove_usage() is deleting the very same list
>> from with in the loop. That is the bug.
>
> Strange. kill_use_list() is only iterated via insn->phi_list
> or insn->arguments, not p->user.
>
> But yes, something is surely messing with the lists here.
>

I figure it out. It is a false alarm on the checking side.
The condition I want to check can still cause a bug,
but this report is not one of those conditions.

It is cause by this code:

static int dead_insn(struct instruction *insn, pseudo_t *src1,
pseudo_t *src2, pseudo_t *src3)
{
struct pseudo_user *pu;
FOR_EACH_PTR(insn->target->users, pu) {
if (*pu->userp != VOID)
return 0;
} END_FOR_EACH_PTR(pu);

So the return terminate the execution flow before
reaching to the  END_FOR_EACH_PTR(pu).
The ptr->active still think we are in the loop
But we are not.

It is a bug in the checking side. I still think this kind
of checking is useful, but need some special handle
of bail out of the ptr_list loop.

Very sorry about that.

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luc Van Oostenryck July 7, 2017, 8:28 a.m. UTC | #13
On Thu, Jul 06, 2017 at 10:15:02PM -0700, Christopher Li wrote:
> On Thu, Jul 6, 2017 at 6:35 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > Gaah.
> >
> > I think the original idea for this was to call pack_ptr_list() only at
> > the end of all the list traversal that could delete things.
> >
> 
> That is a very good suggestion. The problem would be to find out who
> is the last one using that list. Also walking the list to find out the empty
> ptr is extra overhead.
> 
> I have an idea. We can actually reference count the ptr_list usage like the
> above patch. We need to because we want to know who is the lats one using
> that ptrlist bucket. The last one using the ptr_list bucket (not the whole list)
> can pack this bucket. It will avoid walking the list more than once just to do
> the packing. I will try this idea soon. It might work.

This reference count is a property of the whole list and would thus
need to be stored in the head of the list, which doesn't really exist.

Also, I'm not convinced that the problem can only caused by the list
packing in the inner loop (looking at the code for the FOR_EACH, it
looks as it should be OK, but if you enter in the equation, reverse
walking and/or list splitting, I have doubts).

Another point to take in account is that the list packing was needed
because the list walking didn't like empty blocks. But months ago,
because of another bugs we made all the list walking robust against
empty blocks, so list packing shouldn't be needed anymore for
correctness. So we could in general avoid any list packing,
and only do it at some specific safe points just to save memory &
CPU time when walking lists with several blocks but few elements
(and we need to keep in mind that most lists are very very short).

I'm more and more inclined to:
1) in general, avoid altogether deletion from lists in favor
   marking elements as being deleted, either like we do for
   instructions (setting insn->bb to NULL) or storing a special
   value in the ptrlist.
2) at some specific points, as an optimization, doing a sort of
   GC of the list: removing elements marked as deleted and/or
   doing the list packing.
That sound to me as simple, generic and robust.

-- Luc
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luc Van Oostenryck July 7, 2017, 8:32 a.m. UTC | #14
On Fri, Jul 07, 2017 at 01:25:14AM -0700, Christopher Li wrote:
> On Fri, Jul 7, 2017 at 12:11 AM, Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> >> And remove_usage() is deleting the very same list
> >> from with in the loop. That is the bug.
> >
> > Strange. kill_use_list() is only iterated via insn->phi_list
> > or insn->arguments, not p->user.
> >
> > But yes, something is surely messing with the lists here.
> >
> 
> I figure it out. It is a false alarm on the checking side.
> The condition I want to check can still cause a bug,
> but this report is not one of those conditions.
> 
> It is cause by this code:
> 
> static int dead_insn(struct instruction *insn, pseudo_t *src1,
> pseudo_t *src2, pseudo_t *src3)
> {
> struct pseudo_user *pu;
> FOR_EACH_PTR(insn->target->users, pu) {
> if (*pu->userp != VOID)
> return 0;
> } END_FOR_EACH_PTR(pu);
> 
> So the return terminate the execution flow before
> reaching to the  END_FOR_EACH_PTR(pu).
> The ptr->active still think we are in the loop
> But we are not.

OK, I see.
 
> It is a bug in the checking side. I still think this kind
> of checking is useful, but need some special handle
> of bail out of the ptr_list loop.

Yes, some kind of optional integrity checking or the kind of
checks you're doing here should be very usefull.

> Very sorry about that.

No problem, of course. 

-- Luc
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christopher Li July 7, 2017, 9:06 a.m. UTC | #15
On Fri, Jul 7, 2017 at 1:28 AM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
> This reference count is a property of the whole list and would thus
> need to be stored in the head of the list, which doesn't really exist.

Obviously I am too tired to think straight right now. But I  think my
idea of reference counting on the list node, not the while list might
actually work.  Because, if the inner loop delete an entry of a different
node than the outer loop was currently holding on, it is actually fine.
Why? Because by the time the outer loop advance to the that the
deleted node, it will reload the list->nr from memory.
I can be wrong through. I should be able to find out soon enough.

> Also, I'm not convinced that the problem can only caused by the list
> packing in the inner loop (looking at the code for the FOR_EACH, it
> looks as it should be OK, but if you enter in the equation, reverse
> walking and/or list splitting, I have doubts).

Like I said, if you delete entry while the outer loop holding the
same ptr node, it is a bug even without packing the list in inner
loop.

I think ref counting the node should be fine with reversing. List
spiting I have to. Will find out.

>
> Another point to take in account is that the list packing was needed
> because the list walking didn't like empty blocks. But months ago,
> because of another bugs we made all the list walking robust against
> empty blocks, so list packing shouldn't be needed anymore for
> correctness. So we could in general avoid any list packing,
> and only do it at some specific safe points just to save memory &
> CPU time when walking lists with several blocks but few elements
> (and we need to keep in mind that most lists are very very short).

I still think if ref counting node properly, it will solve the bug I describe
and possible the list packing as well. If nobody else is holding that
node, it should be fine to pack and delete it. It is not thread safe but
that is a different issue.

>
> I'm more and more inclined to:
> 1) in general, avoid altogether deletion from lists in favor
>    marking elements as being deleted, either like we do for
>    instructions (setting insn->bb to NULL) or storing a special
>    value in the ptrlist.

My idea of the ref count will do exactly that when the inner
loop try to delete an entry but outer loop holding the node as well.
It will increase the deleted count and replace the ptr entry to
NULL.

When the ref count of the node drop to zero and node has delete count.
Pack the entry[] array, remove NULL entry and even possible delete
the node as well.

Let me know if you see any problem with that approach.

> 2) at some specific points, as an optimization, doing a sort of
>    GC of the list: removing elements marked as deleted and/or
>    doing the list packing.

If the ref count node works, then just pack the node when
ref count drop to zero (or one with some care).

One idea is that we can use sparse to check itself
if there is any sparse code return in a loop without dec the ref count.

> That sound to me as simple, generic and robust.

Yes. So our idea is actually very similar. The only difference
is I use ref count as means of GC, so the ptrlist always knows
when it is safe to delete stuff.

I will find out more.

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dibyendu Majumdar July 7, 2017, 9:19 a.m. UTC | #16
Hi,

On 7 July 2017 at 07:02, Christopher Li <sparse@chrisli.org> wrote:
> On Thu, Jul 6, 2017 at 10:44 PM, Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
>> The ptrlist is already fragile and now it would contains
>> yet another field that need to be taken care of *and* stay
>> coherent with other users. I doubt this will help to reduce bugs.
>>
> The reason ptrlist is fragile
> is we did not do it properly. Not safe against recursive ptr
> loop with inner loop delete stuff from the same list.
>

One of the first things I encountered when trying to get Sparse to
compile with MSVC was ptrlist as it uses some gcc only features. I
found it quite hard to follow the Macros so I attempted to reduce the
use of macros, and introduced the concept of iterators. So the macros
I use now look like below. I think this approach is easier to
understand and maintain. If you would like a patch for similar
implementation for Sparse then please let me know.

In general though - while a list is being traversed, deleting items
should be expected to work, as long as a ptrlist node is not deleted.
The reason is that an iterator is presumably on a particular node -
and if unknown to it the iterator is deleted, then it has no way to
continue. So maybe calling pack in the middle of an operation is
causing the problem?

Anyway - here is a snippet from my modified ptrlist implementation.

/* Each node in the list */
struct ptr_list {
int nr_;
struct ptr_list *prev_;
struct ptr_list *next_;
struct allocator *allocator_;
void *list_[LIST_NODE_NR];
};

struct ptr_list_iter {
struct ptr_list *__head;
struct ptr_list *__list;
int __nr;
};


#define FOR_EACH_PTR(list, var) \
{ struct ptr_list_iter var##iter__ = ptrlist_forward_iterator(list); \
for (var = ptrlist_iter_next(&var##iter__); var != NULL; var =
ptrlist_iter_next(&var##iter__))
#define FOR_EACH_PTR_TYPED(list, type, var) \
{ struct ptr_list_iter var##iter__ = ptrlist_forward_iterator(list); \
for (var = (type) ptrlist_iter_next(&var##iter__); var != NULL; var =
(type) ptrlist_iter_next(&var##iter__))
#define FOR_EACH_PTR_TYPE(list, var, ptr_type) \
{ struct ptr_list_iter var##iter__ = ptrlist_forward_iterator(list); \
for (var = (ptr_type) ptrlist_iter_next(&var##iter__); var != NULL;
var = (ptr_type) ptrlist_iter_next(&var##iter__))
#define END_FOR_EACH_PTR(var) }

#define FOR_EACH_PTR_REVERSE(list, var) \
{ struct ptr_list_iter var##iter__ = ptrlist_reverse_iterator(list); \
for (var = ptrlist_iter_prev(&var##iter__); var != NULL; var =
ptrlist_iter_prev(&var##iter__))
#define END_FOR_EACH_PTR_REVERSE(var) }

#define RECURSE_PTR_REVERSE(list, var) \
{ struct ptr_list_iter var##iter__ = list##iter__; \
for (var = ptrlist_iter_prev(&var##iter__); var != NULL; var =
ptrlist_iter_prev(&var##iter__))

#define PREPARE_PTR_LIST(list, var) \
struct ptr_list_iter var##iter__ = ptrlist_forward_iterator(list); \
var = ptrlist_iter_next(&var##iter__)

#define NEXT_PTR_LIST(var) \
var = ptrlist_iter_next(&var##iter__)
#define FINISH_PTR_LIST(var)

#define THIS_ADDRESS(type, var) \
(type *)ptrlist_iter_this_address(&var##iter__)

#define DELETE_CURRENT_PTR(var) \
ptrlist_iter_remove(&var##iter__)

#define REPLACE_CURRENT_PTR(type, var, replacement) \
ptrlist_iter_set(&var##iter__, replacement)

#define INSERT_CURRENT(newval, var) \
ptrlist_iter_insert(&var##iter__, newval)
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dibyendu Majumdar July 7, 2017, 9:26 a.m. UTC | #17
Apologies for the typo - correction below.

On 7 July 2017 at 10:19, Dibyendu Majumdar <mobile@majumdar.org.uk> wrote:
> Hi,
>
> On 7 July 2017 at 07:02, Christopher Li <sparse@chrisli.org> wrote:
>> On Thu, Jul 6, 2017 at 10:44 PM, Luc Van Oostenryck
>> <luc.vanoostenryck@gmail.com> wrote:
>>> The ptrlist is already fragile and now it would contains
>>> yet another field that need to be taken care of *and* stay
>>> coherent with other users. I doubt this will help to reduce bugs.
>>>
>> The reason ptrlist is fragile
>> is we did not do it properly. Not safe against recursive ptr
>> loop with inner loop delete stuff from the same list.
>>
>

In general though - while a list is being traversed, deleting items
should be expected to work, as long as a ptrlist node is not deleted.
The reason is that an iterator is presumably on a particular node -
and if unknown to if the node it is on is deleted, then the iterator
has no way to
continue. So maybe calling pack in the middle of an operation is
causing the problem?

I think the current implementation (and my version too) is safe as
long as nodes are not deleted while a traversal is going on?

Thanks and Regards
Dibyendu
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luc Van Oostenryck July 7, 2017, 9:30 a.m. UTC | #18
On Fri, Jul 07, 2017 at 02:06:14AM -0700, Christopher Li wrote:
> On Fri, Jul 7, 2017 at 1:28 AM, Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> >
> > This reference count is a property of the whole list and would thus
> > need to be stored in the head of the list, which doesn't really exist.
> 
> Obviously I am too tired to think straight right now. But I  think my
> idea of reference counting on the list node, not the while list might
> actually work.  Because, if the inner loop delete an entry of a different
> node than the outer loop was currently holding on, it is actually fine.
> Why? Because by the time the outer loop advance to the that the
> deleted node, it will reload the list->nr from memory.
> I can be wrong through. I should be able to find out soon enough.
> 
> > Also, I'm not convinced that the problem can only caused by the list
> > packing in the inner loop (looking at the code for the FOR_EACH, it
> > looks as it should be OK, but if you enter in the equation, reverse
> > walking and/or list splitting, I have doubts).
> 
> Like I said, if you delete entry while the outer loop holding the
> same ptr node, it is a bug even without packing the list in inner
> loop.

Yup.
 
> I think ref counting the node should be fine with reversing. List
> spiting I have to. Will find out.
> 
> >
> > Another point to take in account is that the list packing was needed
> > because the list walking didn't like empty blocks. But months ago,
> > because of another bugs we made all the list walking robust against
> > empty blocks, so list packing shouldn't be needed anymore for
> > correctness. So we could in general avoid any list packing,
> > and only do it at some specific safe points just to save memory &
> > CPU time when walking lists with several blocks but few elements
> > (and we need to keep in mind that most lists are very very short).
> 
> I still think if ref counting node properly, it will solve the bug I describe
> and possible the list packing as well. If nobody else is holding that
> node, it should be fine to pack and delete it.

Yes, it's very possible that it will work properly, but at which price?
You will need to add some complexity to make it work, is it worth?

> It is not thread safe but
> that is a different issue.

That's really the last of my worries, sparse is not designed to be threaded
and I don't see why it should.
 
> >
> > I'm more and more inclined to:
> > 1) in general, avoid altogether deletion from lists in favor
> >    marking elements as being deleted, either like we do for
> >    instructions (setting insn->bb to NULL) or storing a special
> >    value in the ptrlist.
> 
> My idea of the ref count will do exactly that when the inner
> loop try to delete an entry but outer loop holding the node as well.
> It will increase the deleted count and replace the ptr entry to
> NULL.
> 
> When the ref count of the node drop to zero and node has delete count.
> Pack the entry[] array, remove NULL entry and even possible delete
> the node as well.
> 
> Let me know if you see any problem with that approach.

Except for the added complexity of having to maintain the ref count
for, IMO, no good enough reasons, the problem is that some lists
contains NULL pointers as legitimate values (I don't remember which
parts do but I had the case when I added the protection for the empty
blocks I also completly rewrote all the ptrlist layer with nice clean
iterator structures and a set of functions around it and those functions
naturally returned NULL to meant the end-of-list. First testing went
quite well but then showed some bugs casued by the fact that some
of the lists contained NULL as legitimate value).
It's why I said here above "marking as deleted or storing a special
value". Of course, maybe we can more easily avoid the few special cases
where NULLs are stored in ptrlist.

> > 2) at some specific points, as an optimization, doing a sort of
> >    GC of the list: removing elements marked as deleted and/or
> >    doing the list packing.
> 
> If the ref count node works, then just pack the node when
> ref count drop to zero (or one with some care).
> 
> One idea is that we can use sparse to check itself
> if there is any sparse code return in a loop without dec the ref count.
> 
> > That sound to me as simple, generic and robust.
> 
> Yes. So our idea is actually very similar. The only difference
> is I use ref count as means of GC, so the ptrlist always knows
> when it is safe to delete stuff.
> 
> I will find out more.


-- Luc
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luc Van Oostenryck July 7, 2017, 9:38 a.m. UTC | #19
On Fri, Jul 07, 2017 at 10:26:38AM +0100, Dibyendu Majumdar wrote:
> In general though - while a list is being traversed, deleting items
> should be expected to work, as long as a ptrlist node is not deleted.
> The reason is that an iterator is presumably on a particular node -
> and if unknown to if the node it is on is deleted, then the iterator
> has no way to
> continue. So maybe calling pack in the middle of an operation is
> causing the problem?
> 
> I think the current implementation (and my version too) is safe as
> long as nodes are not deleted while a traversal is going on?

Yes, but the problem is that we do delete nodes while another traversal
is going on. And this can be very indirect.

-- Luc 
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dibyendu Majumdar July 7, 2017, 9:41 a.m. UTC | #20
On 7 July 2017 at 10:38, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote:
> On Fri, Jul 07, 2017 at 10:26:38AM +0100, Dibyendu Majumdar wrote:
>> In general though - while a list is being traversed, deleting items
>> should be expected to work, as long as a ptrlist node is not deleted.
>> The reason is that an iterator is presumably on a particular node -
>> and if unknown to if the node it is on is deleted, then the iterator
>> has no way to
>> continue. So maybe calling pack in the middle of an operation is
>> causing the problem?
>>
>> I think the current implementation (and my version too) is safe as
>> long as nodes are not deleted while a traversal is going on?
>
> Yes, but the problem is that we do delete nodes while another traversal
> is going on. And this can be very indirect.
>

Well I think it is best not to do that really. Why do the nodes need
to be deleted? There is no harm in retaining them surely as long as
the the iterators can handle it?

Regards
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christopher Li July 7, 2017, 9:44 a.m. UTC | #21
On Fri, Jul 7, 2017 at 2:19 AM, Dibyendu Majumdar
<mobile@majumdar.org.uk> wrote:
> Hi,
>
> On 7 July 2017 at 07:02, Christopher Li <sparse@chrisli.org> wrote:
>
> Anyway - here is a snippet from my modified ptrlist implementation.
>
> /* Each node in the list */
> struct ptr_list {
> int nr_;
> struct ptr_list *prev_;
> struct ptr_list *next_;
> struct allocator *allocator_;
> void *list_[LIST_NODE_NR];
> };
>
> struct ptr_list_iter {
> struct ptr_list *__head;
> struct ptr_list *__list;
> int __nr;
> };

I try exactly that before.

The problem with that is, it generate horrible code.
I never able to teach gcc the member of the list_iter
does not need to sync at memory boundary. I really
want it to be register like.




>
>
> #define FOR_EACH_PTR(list, var) \
> { struct ptr_list_iter var##iter__ = ptrlist_forward_iterator(list); \
> for (var = ptrlist_iter_next(&var##iter__); var != NULL; var =
> ptrlist_iter_next(&var##iter__))
> #define FOR_EACH_PTR_TYPED(list, type, var) \
> { struct ptr_list_iter var##iter__ = ptrlist_forward_iterator(list); \
> for (var = (type) ptrlist_iter_next(&var##iter__); var != NULL; var =
> (type) ptrlist_iter_next(&var##iter__))
> #define FOR_EACH_PTR_TYPE(list, var, ptr_type) \
> { struct ptr_list_iter var##iter__ = ptrlist_forward_iterator(list); \
> for (var = (ptr_type) ptrlist_iter_next(&var##iter__); var != NULL;
> var = (ptr_type) ptrlist_iter_next(&var##iter__))
> #define END_FOR_EACH_PTR(var) }
>
> #define FOR_EACH_PTR_REVERSE(list, var) \
> { struct ptr_list_iter var##iter__ = ptrlist_reverse_iterator(list); \
> for (var = ptrlist_iter_prev(&var##iter__); var != NULL; var =
> ptrlist_iter_prev(&var##iter__))
> #define END_FOR_EACH_PTR_REVERSE(var) }

The problem with this is that, once you put delete into mix.
The iter will need to know which direction it need to go after
the delete. It is actually very messy.

 I end up abandon that approach.

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dibyendu Majumdar July 7, 2017, 9:46 a.m. UTC | #22
On 7 July 2017 at 10:44, Christopher Li <sparse@chrisli.org> wrote:
> On Fri, Jul 7, 2017 at 2:19 AM, Dibyendu Majumdar
> <mobile@majumdar.org.uk> wrote:
>> On 7 July 2017 at 07:02, Christopher Li <sparse@chrisli.org> wrote:
>>
>> Anyway - here is a snippet from my modified ptrlist implementation.
>>
>> /* Each node in the list */
>> struct ptr_list {
>> int nr_;
>> struct ptr_list *prev_;
>> struct ptr_list *next_;
>> struct allocator *allocator_;
>> void *list_[LIST_NODE_NR];
>> };
>>
>> struct ptr_list_iter {
>> struct ptr_list *__head;
>> struct ptr_list *__list;
>> int __nr;
>> };
>
> I try exactly that before.
>
> The problem with that is, it generate horrible code.
> I never able to teach gcc the member of the list_iter
> does not need to sync at memory boundary. I really
> want it to be register like.
>
>>
>> #define FOR_EACH_PTR(list, var) \
>> { struct ptr_list_iter var##iter__ = ptrlist_forward_iterator(list); \
>> for (var = ptrlist_iter_next(&var##iter__); var != NULL; var =
>> ptrlist_iter_next(&var##iter__))
>> #define FOR_EACH_PTR_TYPED(list, type, var) \
>> { struct ptr_list_iter var##iter__ = ptrlist_forward_iterator(list); \
>> for (var = (type) ptrlist_iter_next(&var##iter__); var != NULL; var =
>> (type) ptrlist_iter_next(&var##iter__))
>> #define FOR_EACH_PTR_TYPE(list, var, ptr_type) \
>> { struct ptr_list_iter var##iter__ = ptrlist_forward_iterator(list); \
>> for (var = (ptr_type) ptrlist_iter_next(&var##iter__); var != NULL;
>> var = (ptr_type) ptrlist_iter_next(&var##iter__))
>> #define END_FOR_EACH_PTR(var) }
>>
>> #define FOR_EACH_PTR_REVERSE(list, var) \
>> { struct ptr_list_iter var##iter__ = ptrlist_reverse_iterator(list); \
>> for (var = ptrlist_iter_prev(&var##iter__); var != NULL; var =
>> ptrlist_iter_prev(&var##iter__))
>> #define END_FOR_EACH_PTR_REVERSE(var) }
>
> The problem with this is that, once you put delete into mix.
> The iter will need to know which direction it need to go after
> the delete. It is actually very messy.
>
>  I end up abandon that approach.

I am using this without problems in dmr_C so not sure I understand why
it did not work.

Regards
Dibyendu
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luc Van Oostenryck July 7, 2017, 9:52 a.m. UTC | #23
On Fri, Jul 07, 2017 at 10:28:03AM +0200, Luc Van Oostenryck wrote:
> Another point to take in account is that the list packing was needed
> because the list walking didn't like empty blocks. But months ago,
> because of another bugs we made all the list walking robust against
> empty blocks, so list packing shouldn't be needed anymore for
> correctness. So we could in general avoid any list packing,
> and only do it at some specific safe points just to save memory &
> CPU time when walking lists with several blocks but few elements
> (and we need to keep in mind that most lists are very very short).
> 
> I'm more and more inclined to:
> 1) in general, avoid altogether deletion from lists in favor
>    marking elements as being deleted, either like we do for
>    instructions (setting insn->bb to NULL) or storing a special
>    value in the ptrlist.
> 2) at some specific points, as an optimization, doing a sort of
>    GC of the list: removing elements marked as deleted and/or
>    doing the list packing.
> That sound to me as simple, generic and robust.

Another thing that could help is that, in fact, for most lists/
list types we can't have nested list walking.

In practice, BB & insn lists will be the ones really concerned
and insns are immune to the problem since they are never deleted
but simply marked as deleted.

And this is only because the cleanup code is all done inside the
ep->bbs & bb->insns loops. So at the end, I wouldn't be surprised
that only the BBs are impacted, which is coherent to what the
testing showed.

But yes, we need to have guarantees.

-- Luc
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christopher Li July 7, 2017, 9:54 a.m. UTC | #24
On Fri, Jul 7, 2017 at 2:30 AM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> Yes, it's very possible that it will work properly, but at which price?
> You will need to add some complexity to make it work, is it worth?

We will see. I think it is clean enough I will take a try.
The price is my time and other's who will need to review it.



> Except for the added complexity of having to maintain the ref count
> for, IMO, no good enough reasons, the problem is that some lists
> contains NULL pointers as legitimate values (I don't remember which

I guess we will find out. It is not hard to assert of NULL add to the
ptrlist. I do have ways to allow any value in entry but add more complexity.
Doing the correct thing is first priority.

If we are talking about the run time cost, I double it will make much of
the difference. We can benchmark for it.

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christopher Li July 7, 2017, 9:58 a.m. UTC | #25
On Fri, Jul 7, 2017 at 2:41 AM, Dibyendu Majumdar
<mobile@majumdar.org.uk> wrote:
>
> Well I think it is best not to do that really. Why do the nodes need
> to be deleted? There is no harm in retaining them surely as long as
> the the iterators can handle it?
>
Please know that, even the node is not deleted. Just remove the
entry from list->list[] will cause problem on the outer loop on the same
list, if there is one.

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luc Van Oostenryck July 7, 2017, 10 a.m. UTC | #26
On Fri, Jul 7, 2017 at 11:46 AM, Dibyendu Majumdar
<mobile@majumdar.org.uk> wrote:
> I am using this without problems in dmr_C so not sure I understand why
> it did not work.

It may be related to the amount of testing.
The problem we have here is not exactly frequent, on the contrary.
I have tested sparse in all sort of conditions, during hours with a lot of
code, normal and less normal, without any problems.
It's only after I did some fuzzing that I got these, and I only had a few tens
of crashes after more than 150 hours of fuzzing and more than 600GB
of generated code (which was then sometimes purposely corrupted).

-- Luc
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dibyendu Majumdar July 7, 2017, 10:08 a.m. UTC | #27
On 7 July 2017 at 10:58, Christopher Li <sparse@chrisli.org> wrote:
> On Fri, Jul 7, 2017 at 2:41 AM, Dibyendu Majumdar
> <mobile@majumdar.org.uk> wrote:
>>
>> Well I think it is best not to do that really. Why do the nodes need
>> to be deleted? There is no harm in retaining them surely as long as
>> the the iterators can handle it?
>>
> Please know that, even the node is not deleted. Just remove the
> entry from list->list[] will cause problem on the outer loop on the same
> list, if there is one.
>

I will test this in my version but I believe that deleting entries in
list->list[] is okay. Every time the next operation on the iterator is
called, it checks whether its index in the node is still valid. As
long as nodes are not deleted then it should work - but I need to
prove it via a test case.

Here is my forward iterator next implementation:

void *ptrlist_iter_next(struct ptr_list_iter *self) {
  if (self->__head == NULL)
    return NULL;
  self->__nr++;
Lretry:
  if (self->__nr < self->__list->nr_) {
    return self->__list->list_[self->__nr];
  } else if (self->__list->next_ != self->__head) {
    self->__list = self->__list->next_;
    self->__nr = 0;
    goto Lretry;
  }
  return NULL;
}
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christopher Li July 7, 2017, 12:54 p.m. UTC | #28
On Fri, Jul 7, 2017 at 3:08 AM, Dibyendu Majumdar
<mobile@majumdar.org.uk> wrote:
>
> I will test this in my version but I believe that deleting entries in
> list->list[] is okay. Every time the next operation on the iterator is
> called, it checks whether its index in the node is still valid. As
> long as nodes are not deleted then it should work - but I need to
> prove it via a test case.

As far as I can tell, your code will have the same bug.  If
same list used in the outer loop and inner loop and inner
loop delete an entry.


>
> Here is my forward iterator next implementation:
>
> void *ptrlist_iter_next(struct ptr_list_iter *self) {
>   if (self->__head == NULL)
>     return NULL;
>   self->__nr++;
> Lretry:
>   if (self->__nr < self->__list->nr_) {
>     return self->__list->list_[self->__nr];
>   } else if (self->__list->next_ != self->__head) {
>     self->__list = self->__list->next_;
>     self->__nr = 0;
>     goto Lretry;
>   }

Let say __list has 8 entry. nr = 8.  self->__nr = 3.
In the inner loop delete first 2 entry. Then the list[2:8]
will shift to list[0:6]. Now your self->__nr = 3 will
skip the next 2 entry because they move to list[1:3].

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dibyendu Majumdar July 7, 2017, 1:01 p.m. UTC | #29
On 7 July 2017 at 13:54, Christopher Li <sparse@chrisli.org> wrote:
> On Fri, Jul 7, 2017 at 3:08 AM, Dibyendu Majumdar
> <mobile@majumdar.org.uk> wrote:
>>
>> I will test this in my version but I believe that deleting entries in
>> list->list[] is okay. Every time the next operation on the iterator is
>> called, it checks whether its index in the node is still valid. As
>> long as nodes are not deleted then it should work - but I need to
>> prove it via a test case.
>
> As far as I can tell, your code will have the same bug.  If
> same list used in the outer loop and inner loop and inner
> loop delete an entry.
>
>
>>
>> Here is my forward iterator next implementation:
>>
>> void *ptrlist_iter_next(struct ptr_list_iter *self) {
>>   if (self->__head == NULL)
>>     return NULL;
>>   self->__nr++;
>> Lretry:
>>   if (self->__nr < self->__list->nr_) {
>>     return self->__list->list_[self->__nr];
>>   } else if (self->__list->next_ != self->__head) {
>>     self->__list = self->__list->next_;
>>     self->__nr = 0;
>>     goto Lretry;
>>   }
>
> Let say __list has 8 entry. nr = 8.  self->__nr = 3.
> In the inner loop delete first 2 entry. Then the list[2:8]
> will shift to list[0:6]. Now your self->__nr = 3 will
> skip the next 2 entry because they move to list[1:3].
>

Yes okay. So that means the inner loop cannot modify the list, as even
adding an element will alter the order.

Presumably this issue occurs when the same list is traversed
recursively and the inner loop modifies the list. Was this usage
always there is Sparse or is it a recent change?

Regards
Dibyendu
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dibyendu Majumdar July 7, 2017, 1:18 p.m. UTC | #30
Hi Chris,

On 7 July 2017 at 10:06, Christopher Li <sparse@chrisli.org> wrote:
> On Fri, Jul 7, 2017 at 1:28 AM, Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
>>
> My idea of the ref count will do exactly that when the inner
> loop try to delete an entry but outer loop holding the node as well.
> It will increase the deleted count and replace the ptr entry to
> NULL.
>
> When the ref count of the node drop to zero and node has delete count.
> Pack the entry[] array, remove NULL entry and even possible delete
> the node as well.
>
> Let me know if you see any problem with that approach.
>

How will the insertion scenario be handled - or even splits caused by
insertion? These would also invalidate the order of the entries in a
ptr list node, right?

I think that maybe an alternative approach is to use a lock, and
ensure that the ptr list node is locked while it is being iterated.
Only lock holder can modify. Any inner loop will fail if it tries to
modify the node. The lock can be a pointer - so if NULL unlocked, else
locked. Using a pointer as lock allows concept of ownership.

But anyway this is only to catch scenarios where the list is being
changed by an inner loop.

Regards
Dibyendu
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luc Van Oostenryck July 7, 2017, 1:25 p.m. UTC | #31
On Fri, Jul 7, 2017 at 3:18 PM, Dibyendu Majumdar
<mobile@majumdar.org.uk> wrote:
> Hi Chris,
>
> On 7 July 2017 at 10:06, Christopher Li <sparse@chrisli.org> wrote:
>> On Fri, Jul 7, 2017 at 1:28 AM, Luc Van Oostenryck
>> <luc.vanoostenryck@gmail.com> wrote:
>>>
>> My idea of the ref count will do exactly that when the inner
>> loop try to delete an entry but outer loop holding the node as well.
>> It will increase the deleted count and replace the ptr entry to
>> NULL.
>>
>> When the ref count of the node drop to zero and node has delete count.
>> Pack the entry[] array, remove NULL entry and even possible delete
>> the node as well.
>>
>> Let me know if you see any problem with that approach.
>>
>
> How will the insertion scenario be handled - or even splits caused by
> insertion? These would also invalidate the order of the entries in a
> ptr list node, right?
>
> I think that maybe an alternative approach is to use a lock, and
> ensure that the ptr list node is locked while it is being iterated.

Don't forget that it's not multithreaded code we're talking here:
a lock, in itself, won't help since there is no concurrent access.

Otherwise, it's quite similar to Chris' idea of using a refcount.
But then, we 'just' have to correctly maintain this refcount.

-- Luc
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dibyendu Majumdar July 7, 2017, 1:29 p.m. UTC | #32
Hi Luc,

On 7 July 2017 at 14:25, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote:
> On Fri, Jul 7, 2017 at 3:18 PM, Dibyendu Majumdar
> <mobile@majumdar.org.uk> wrote:
>> On 7 July 2017 at 10:06, Christopher Li <sparse@chrisli.org> wrote:
>>> On Fri, Jul 7, 2017 at 1:28 AM, Luc Van Oostenryck
>>> <luc.vanoostenryck@gmail.com> wrote:
>>>>
>>> My idea of the ref count will do exactly that when the inner
>>> loop try to delete an entry but outer loop holding the node as well.
>>> It will increase the deleted count and replace the ptr entry to
>>> NULL.
>>>
>>> When the ref count of the node drop to zero and node has delete count.
>>> Pack the entry[] array, remove NULL entry and even possible delete
>>> the node as well.
>>>
>>> Let me know if you see any problem with that approach.
>>>
>>
>> How will the insertion scenario be handled - or even splits caused by
>> insertion? These would also invalidate the order of the entries in a
>> ptr list node, right?
>>
>> I think that maybe an alternative approach is to use a lock, and
>> ensure that the ptr list node is locked while it is being iterated.
>
> Don't forget that it's not multithreaded code we're talking here:
> a lock, in itself, won't help since there is no concurrent access.

Yes realize that - so the lock here is only a concept. It is not a
mutex or anything like that.

>
> Otherwise, it's quite similar to Chris' idea of using a refcount.
> But then, we 'just' have to correctly maintain this refcount.
>

Agree that if refcount is used as a lock to prevent inner loops from
amending the list then it is the same. But I think Chris is trying to
do more - i.e. allow an inner loop to amend the list. That might
simply not be possible to handle safely.

Regards
Dibyendu
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luc Van Oostenryck July 7, 2017, 1:47 p.m. UTC | #33
On Fri, Jul 07, 2017 at 02:29:56PM +0100, Dibyendu Majumdar wrote:
> Agree that if refcount is used as a lock to prevent inner loops from
> amending the list then it is the same. But I think Chris is trying to
> do more - i.e. allow an inner loop to amend the list. That might
> simply not be possible to handle safely.

We have to do something more (or something less :)) otherwise
the refcount can only be used to detect a problem. In other words,
having a situation like "OK, I can't safely do XYZ here" is maybe
better than the current situation but doesn't help when you need
to do XYZ.

-- Luc
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christopher Li July 8, 2017, 3:43 p.m. UTC | #34
On Fri, Jul 7, 2017 at 6:47 AM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
> We have to do something more (or something less :)) otherwise
> the refcount can only be used to detect a problem. In other words,
> having a situation like "OK, I can't safely do XYZ here" is maybe
> better than the current situation but doesn't help when you need
> to do XYZ.

Right. The real implementation will replace the die with some code to
handle those situation. I am also considering a debug version of
sparse which performance those extra test at a performance penalty.

Just generate two executable. One for normal production use. The
other one for developer to find out potential issue. Some of the code
have to be compile differently(ptrlist), it is not easy to implement as turn on
by a command line options.

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/ptrlist.h b/ptrlist.h
index d09be2f..4f38a4e 100644
--- a/ptrlist.h
+++ b/ptrlist.h
@@ -25,7 +25,8 @@ 
 #define LIST_NODE_NR (29)

 struct ptr_list {
- int nr;
+ unsigned int nr:16;
+ unsigned int active:16;
  struct ptr_list *prev;
  struct ptr_list *next;
  void *list[LIST_NODE_NR];
@@ -44,6 +45,7 @@  extern void concat_ptr_list(struct ptr_list *a,
struct ptr_list **b);
 extern void __free_ptr_list(struct ptr_list **);
 extern int ptr_list_size(struct ptr_list *);
 extern int linearize_ptr_list(struct ptr_list *, void **, int);
+extern void die(const char *, ...);

 /*
  * Hey, who said that you can't do overloading in C?
@@ -158,6 +160,7 @@  static inline void *last_ptr_list(struct ptr_list *list)
  CHECK_TYPE(head,ptr); \
  if (__head) { \
  do { int __nr; \
+ __list->active++; \
  for (__nr = 0; __nr < __list->nr; __nr++) { \
  do { \
  ptr = PTR_ENTRY(__list,__nr); \
@@ -167,6 +170,7 @@  static inline void *last_ptr_list(struct ptr_list *list)
  } while (0); \
  } while (0); \
  } \
+ __list->active--; \
  } while ((__list = __list->next) != __head); \
  } \
 } while (0)
@@ -179,6 +183,7 @@  static inline void *last_ptr_list(struct ptr_list *list)
  do { int __nr; \
  __list = __list->prev; \
  __nr = __list->nr; \
+ __list->active++; \
  while (--__nr >= 0) { \
  do { \
  ptr = PTR_ENTRY(__list,__nr); \
@@ -189,6 +194,7 @@  static inline void *last_ptr_list(struct ptr_list *list)
  } while (0); \
  } while (0); \
  } \
+ __list->active--; \
  } while (__list != __head); \
  } \
 } while (0)From: Christopher Li <sparse@chrisli.org>
Date: Thu, 6 Jul 2017 16:25:38 -0700
Subject: [PATCH 2/2] check on delete list entry while parent caller using it.

This is checking for type the bug Luc reported.
Basiclly, the parent DO_FOR_EACH() has __nr caching the
current ptr position. When the inner loop function delete
one entry before the current position. The parent current
position needs to be adjust, because the entry[] has been
move forward.

This patch only check usage FOR_EACH_XXX macro. There is
also PREPARE_PTR_LIST haven't cover by this patch.
It is already catching bugs left and right.

Most noticablely remove_usage() inside of the  kill_use_list()
loop.
---
 ptrlist.h | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/ptrlist.h b/ptrlist.h
index d09be2f..4f38a4e 100644
--- a/ptrlist.h
+++ b/ptrlist.h
@@ -25,7 +25,8 @@ 
 #define LIST_NODE_NR (29)

 struct ptr_list {
- int nr;
+ unsigned int nr:16;
+ unsigned int active:16;
  struct ptr_list *prev;
  struct ptr_list *next;
  void *list[LIST_NODE_NR];
@@ -44,6 +45,7 @@  extern void concat_ptr_list(struct ptr_list *a,
struct ptr_list **b);
 extern void __free_ptr_list(struct ptr_list **);
 extern int ptr_list_size(struct ptr_list *);
 extern int linearize_ptr_list(struct ptr_list *, void **, int);
+extern void die(const char *, ...);

 /*
  * Hey, who said that you can't do overloading in C?
@@ -158,6 +160,7 @@  static inline void *last_ptr_list(struct ptr_list *list)
  CHECK_TYPE(head,ptr); \
  if (__head) { \
  do { int __nr; \
+ __list->active++; \
  for (__nr = 0; __nr < __list->nr; __nr++) { \
  do { \
  ptr = PTR_ENTRY(__list,__nr); \
@@ -167,6 +170,7 @@  static inline void *last_ptr_list(struct ptr_list *list)
  } while (0); \
  } while (0); \
  } \
+ __list->active--; \From: Christopher Li <sparse@chrisli.org>
Date: Thu, 6 Jul 2017 16:25:38 -0700
Subject: [PATCH 2/2] check on delete list entry while parent caller using it.

This is checking for type the bug Luc reported.
Basiclly, the parent DO_FOR_EACH() has __nr caching the
current ptr position. When the inner loop function delete
one entry before the current position. The parent current
position needs to be adjust, because the entry[] has been
move forward.

This patch only check usage FOR_EACH_XXX macro. There is
also PREPARE_PTR_LIST haven't cover by this patch.
It is already catching bugs left and right.

Most noticablely remove_usage() inside of the  kill_use_list()
loop.
---
 ptrlist.h | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/ptrlist.h b/ptrlist.h
index d09be2f..4f38a4e 100644
--- a/ptrlist.h
+++ b/ptrlist.h
@@ -25,7 +25,8 @@ 
 #define LIST_NODE_NR (29)

 struct ptr_list {
- int nr;
+ unsigned int nr:16;
+ unsigned int active:16;
  struct ptr_list *prev;
  struct ptr_list *next;
  void *list[LIST_NODE_NR];
@@ -44,6 +45,7 @@  extern void concat_ptr_list(struct ptr_list *a,
struct ptr_list **b);
 extern void __free_ptr_list(struct ptr_list **);
 extern int ptr_list_size(struct ptr_list *);
 extern int linearize_ptr_list(struct ptr_list *, void **, int);
+extern void die(const char *, ...);

 /*