diff mbox

slab corruption with current -git (was Re: [git pull] vfs pile 1 (splice))

Message ID f7tbmytkmtr.fsf@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Aaron Conole Oct. 10, 2016, 1:35 a.m. UTC
Florian Westphal <fw@strlen.de> writes:

> Linus Torvalds <torvalds@linux-foundation.org> wrote:
>> On Sun, Oct 9, 2016 at 12:11 PM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>> >
>> > Anyway, I don't think I can bisect it, but I'll try to narrow it down
>> > a *bit* at least.
>> >
>> > Not doing any more pulls on this unstable base, I've been puttering
>> > around in trying to clean up some stupid printk logging issues
>> > instead.
>> 
>> So I finally got a oops with slub debugging enabled. It doesn't really
>> narrow things down, though, it kind of extends on the possible
>> suspects. Now adding David Miller and Pablo, because it looks like it
>> may be netfilter that does something bad and corrupts memory.
>
> Quite possible, the netns interactions are not nice :-/
>
>> Without further ado, here's the new oops:
>> 
>>    general protection fault: 0000 [#1] SMP
>>    CPU: 7 PID: 169 Comm: kworker/u16:7 Not tainted
>> 4.8.0-11288-gb66484cd7470 #1
>>    Hardware name: System manufacturer System Product Name/Z170-K, BIOS
> ..
>>    Call Trace:
>>      netfilter_net_exit+0x2f/0x60
>>      ops_exit_list.isra.4+0x38/0x60
>>      cleanup_net+0x1ba/0x2a0
>>      process_one_work+0x1f1/0x480
>>      worker_thread+0x48/0x4d0
>>      ? process_one_work+0x480/0x480
>
> ..
>
>> like it's a pointer loaded from a free'd allocation.
>> 
>> The code disassembles to
>> 
>>    0: 0f b6 ca             movzbl %dl,%ecx
>>    3: 48 8d 84 c8 00 01 00 lea    0x100(%rax,%rcx,8),%rax
>>    a: 00
>>    b: 49 8b 5c c5 00       mov    0x0(%r13,%rax,8),%rbx
>>   10: 48 85 db             test   %rbx,%rbx
>>   13: 0f 84 cb 00 00 00     je     0xe4
>>   19: 4c 3b 63 40           cmp    0x40(%rbx),%r12
>>   1d: 48 8b 03             mov    (%rbx),%rax
>>   20: 0f 84 e9 00 00 00     je     0x10f
>>   26: 48 85 c0             test   %rax,%rax
>>   29: 74 26                 je     0x51
>>   2b:* 4c 3b 60 40           cmp    0x40(%rax),%r12 <-- trapping instruction
>>   2f: 75 08                 jne    0x39
>>   31: e9 ef 00 00 00       jmpq   0x125
>>   36: 48 89 d8             mov    %rbx,%rax
>>   39: 48 8b 18             mov    (%rax),%rbx
>>   3c: 48 85 db             test   %rbx,%rbx
>> 
>> and that oopsing instruction seems to be the compare of
>> "hooks_entry->orig_ops" from hooks_entry in this expression:
>> 
>>         if (hooks_entry && hooks_entry->orig_ops == reg) {
>> 
>> so hooks_entry() is bogus. It was gotten from
>> 
>>         hooks_entry = nf_hook_entry_head(net, reg);
>> 
>> but that's as far as I dug. And yes, I do have
>> CONFIG_NETFILTER_INGRESS=y in case that matters.
>> 
>> And all this code has changed pretty radically in commit e3b37f11e6e4
>> ("netfilter: replace list_head with single linked list"), and there
>> was clearly already something wrong with that code, with commit
>> 5119e4381a90 ("netfilter: Fix potential null pointer dereference")
>> adding the test against NULL. But I suspect that only hid the "oops,
>> it's actually not NULL, it loaded some uninitialized value" problem.
>> 
>> Over to the networking guys.. Ideas?
>
> Sorry, not off the top of my head.
> Pablo is currently travelling back home from netdev 1.2 in Tokyo,
> I can help starting Wednesday when I am back.
>
> One shot in the dark (not even compile tested; wonder if we can end up
> zapping bogus hook ...)
>

I was just about to build and test something similar:

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Linus Torvalds Oct. 10, 2016, 2:49 a.m. UTC | #1
On Sun, Oct 9, 2016 at 6:35 PM, Aaron Conole <aconole@redhat.com> wrote:
>
> I was just about to build and test something similar:

So I haven't actually tested that one, but looking at the code, it
really looks very bogus. In fact, that code just looks like crap. It
does *not* do a proper "remove singly linked list entry". It's exactly
the kind of code that I rail against, and that people should never
write.

Any code that can't even traverse a linked list is not worth looking at.

There is one *correct* way to remove an entry from a singly linked
list, and it looks like this:

    struct entry **pp, *p;

    pp = &head;
    while ((p = *pp) != NULL) {
        if (right_entry(p)) {
            *pp = p->next;
            break;
        }
        pp = &p->next;
    }

and that's it. Nothing else. The above code exits the loop with "p"
containing the entry that was removed, or NULL if nothing was. It
can't get any simpler than that, but more importantly, anything more
complicated than that is WRONG.

Seriously, nothing else is acceptable. In particular, any linked list
traversal that makes a special case of the first entry or the last
entry should not be allowed to exist. Note how there is not a single
special case in the above correct code. It JustWorks(tm).

That nf_unregister_net_hook() code has all the signs of exactly that
kind of broken list-handling code: special-casing the head of the
loop, and having the loop condition test both current and that odd
"next to next" pointer etc. It's all very very wrong.

So I really see two options:

 - do that singly-linked list traversal right (and I'm serious:
nothing but the code above can ever be right)

 - don't make up your own list handling code at all, and use the
standard linux list code.

So either e3b37f11e6e4 needs to be reverted, or it needs to be taught
to use real list handling.  If the code doesn't want to use the
regular list.h (either the doubly linked one, or the hlist one), it
needs to at least learn to do list removal right.

               Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index c9d90eb..e84103f 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -189,7 +189,7 @@  void nf_unregister_net_hook(struct net *net, const struct nf_hook_ops *reg)
 
 unlock:
        mutex_unlock(&nf_hook_mutex);
-       if (!hooks_entry) {
+       if (!hooks_entry || hooks_entry->orig_ops != reg) {
                WARN(1, "nf_unregister_net_hook: hook not found!\n");
                return;
        }