From patchwork Wed Mar 30 09:36:59 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 8694251 Return-Path: X-Original-To: patchwork-linux-fsdevel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 9507DC0553 for ; Wed, 30 Mar 2016 09:37:29 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id A975120361 for ; Wed, 30 Mar 2016 09:37:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id AAC5E2035D for ; Wed, 30 Mar 2016 09:37:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751670AbcC3JhD (ORCPT ); Wed, 30 Mar 2016 05:37:03 -0400 Received: from casper.infradead.org ([85.118.1.10]:42627 "EHLO casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751574AbcC3JhC (ORCPT ); Wed, 30 Mar 2016 05:37:02 -0400 Received: from j217100.upc-j.chello.nl ([24.132.217.100] helo=twins) by casper.infradead.org with esmtpsa (Exim 4.85 #2 (Red Hat Linux)) id 1alCYd-0008Gi-Ks; Wed, 30 Mar 2016 09:36:59 +0000 Received: by twins (Postfix, from userid 1000) id 14CDD1002EA35; Wed, 30 Mar 2016 11:36:59 +0200 (CEST) Date: Wed, 30 Mar 2016 11:36:59 +0200 From: Peter Zijlstra To: Ingo Molnar Cc: Alfredo Alvarez Fernandez , Linus Torvalds , Sedat Dilek , Theodore Ts'o , linux-fsdevel , LKML Subject: Re: [Linux-v4.6-rc1] ext4: WARNING: CPU: 2 PID: 2692 at kernel/locking/lockdep.c:2017 __lock_acquire+0x180e/0x2260 Message-ID: <20160330093659.GS3408@twins.programming.kicks-ass.net> References: <20160327204810.GW6356@twins.programming.kicks-ass.net> <20160329084701.GA9393@gmail.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20160329084701.GA9393@gmail.com> User-Agent: Mutt/1.5.21 (2012-12-30) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Spam-Status: No, score=-7.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On Tue, Mar 29, 2016 at 10:47:02AM +0200, Ingo Molnar wrote: > > You are right; this is lockdep running into a hash collision; which is a new > > DEBUG_LOCKDEP test. See 9e4e7554e755 ("locking/lockdep: Detect chain_key > > collisions"). > > I've Cc:-ed Alfredo Alvarez Fernandez who added that test. OK, so while the code in check_no_collision() seems sensible, it relies on borken bits. The whole chain_hlocks and /proc/lockdep_chains stuff appears to have been buggered from the start. The below patch should fix this. Furthermore, our hash function has definite room for improvement. --- include/linux/lockdep.h | 8 +++++--- kernel/locking/lockdep.c | 30 ++++++++++++++++++++++++------ kernel/locking/lockdep_proc.c | 2 ++ 3 files changed, 31 insertions(+), 9 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h index d026b190c530..2568c120513b 100644 --- a/include/linux/lockdep.h +++ b/include/linux/lockdep.h @@ -196,9 +196,11 @@ struct lock_list { * We record lock dependency chains, so that we can cache them: */ struct lock_chain { - u8 irq_context; - u8 depth; - u16 base; + /* see BUILD_BUG_ON()s in lookup_chain_cache() */ + unsigned int irq_context : 2, + depth : 6, + base : 24; + /* 4 byte hole */ struct hlist_node entry; u64 chain_key; }; diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c index 53ab2f85d77e..91a4b7780afb 100644 --- a/kernel/locking/lockdep.c +++ b/kernel/locking/lockdep.c @@ -2099,15 +2099,37 @@ static inline int lookup_chain_cache(struct task_struct *curr, chain->irq_context = hlock->irq_context; i = get_first_held_lock(curr, hlock); chain->depth = curr->lockdep_depth + 1 - i; + + BUILD_BUG_ON((1UL << 24) <= ARRAY_SIZE(chain_hlocks)); + BUILD_BUG_ON((1UL << 6) <= ARRAY_SIZE(curr->held_locks)); + BUILD_BUG_ON((1UL << 8*sizeof(chain_hlocks[0])) <= ARRAY_SIZE(lock_classes)); + if (likely(nr_chain_hlocks + chain->depth <= MAX_LOCKDEP_CHAIN_HLOCKS)) { chain->base = nr_chain_hlocks; - nr_chain_hlocks += chain->depth; for (j = 0; j < chain->depth - 1; j++, i++) { int lock_id = curr->held_locks[i].class_idx - 1; chain_hlocks[chain->base + j] = lock_id; } chain_hlocks[chain->base + j] = class - lock_classes; } + + if (nr_chain_hlocks < MAX_LOCKDEP_CHAIN_HLOCKS) + nr_chain_hlocks += chain->depth; + +#ifdef CONFIG_DEBUG_LOCKDEP + /* + * Important for check_no_collision(). + */ + if (unlikely(nr_chain_hlocks > MAX_LOCKDEP_CHAIN_HLOCKS)) { + if (debug_locks_off_graph_unlock()) + return 0; + + print_lockdep_off("BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!"); + dump_stack(); + return 0; + } +#endif + hlist_add_head_rcu(&chain->entry, hash_head); debug_atomic_inc(chain_lookup_misses); inc_chains(); @@ -2860,11 +2882,6 @@ static int separate_irq_context(struct task_struct *curr, { unsigned int depth = curr->lockdep_depth; - /* - * Keep track of points where we cross into an interrupt context: - */ - hlock->irq_context = 2*(curr->hardirq_context ? 1 : 0) + - curr->softirq_context; if (depth) { struct held_lock *prev_hlock; @@ -3164,6 +3181,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass, hlock->acquire_ip = ip; hlock->instance = lock; hlock->nest_lock = nest_lock; + hlock->irq_context = 2*(!!curr->hardirq_context) + !!curr->softirq_context; hlock->trylock = trylock; hlock->read = read; hlock->check = check; diff --git a/kernel/locking/lockdep_proc.c b/kernel/locking/lockdep_proc.c index dbb61a302548..a0f61effad25 100644 --- a/kernel/locking/lockdep_proc.c +++ b/kernel/locking/lockdep_proc.c @@ -141,6 +141,8 @@ static int lc_show(struct seq_file *m, void *v) int i; if (v == SEQ_START_TOKEN) { + if (nr_chain_hlocks > MAX_LOCKDEP_CHAIN_HLOCKS) + seq_printf(m, "(buggered) "); seq_printf(m, "all lock chains:\n"); return 0; }