From patchwork Mon Jul 2 04:28:58 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Feng Tang
X-Patchwork-Id: 1145411
Return-Path:
X-Original-To: patchwork-linux-nfs@patchwork.kernel.org
Delivered-To: patchwork-process-083081@patchwork1.kernel.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by patchwork1.kernel.org (Postfix) with ESMTP id 969C540ABE
	for ; Mon, 2 Jul 2012 04:33:16 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752034Ab2GBEdP (ORCPT ); Mon, 2 Jul 2012 00:33:15 -0400
Received: from mga01.intel.com ([192.55.52.88]:21757 "EHLO mga01.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751632Ab2GBEdN (ORCPT ); Mon, 2 Jul 2012 00:33:13 -0400
Received: from fmsmga001.fm.intel.com ([10.253.24.23])
	by fmsmga101.fm.intel.com with ESMTP; 01 Jul 2012 21:33:13 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="4.71,315,1320652800"; d="scan'208";a="172361607"
Received: from feng-i7.sh.intel.com (HELO feng-i7) ([10.239.14.125])
	by fmsmga001.fm.intel.com with ESMTP; 01 Jul 2012 21:33:11 -0700
Date: Mon, 2 Jul 2012 12:28:58 +0800
From: Feng Tang
To: Fengguang Wu, Christoph Lameter
Cc: Trond Myklebust, linux-nfs@vger.kernel.org,
	linux-kernel@vger.kernel.org, netdev, penberg@kernel.org,
	linux-mm@kvack.org, Stephen Rothwell, bfields@fieldses.org
Subject: Re: linux-next BUG: held lock freed!
Message-ID: <20120702122858.029946db@feng-i7>
In-Reply-To:
References: <20120626145432.GA15289@localhost>
	<20120626172918.GA16446@localhost>
	<20120627122306.GA19252@localhost>
	<20120702025625.GA6531@localhost>
Organization: intel
X-Mailer: Claws Mail 3.7.6 (GTK+ 2.22.0; i486-pc-linux-gnu)
Mime-Version: 1.0
Sender: linux-nfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-nfs@vger.kernel.org

> From: Fengguang Wu
> Date: 2012/7/2
> Subject: linux-next BUG: held lock freed!
> To: Christoph Lameter
> Cc: Trond Myklebust, "J. Bruce Fields" <bfields@fieldses.org>,
>     linux-nfs@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>, netdev,
>     Pekka Enberg, Linux Memory Management List <linux-mm@kvack.org>,
>     Stephen Rothwell
>
>
> Hi all,
>
> More observations on this bug:
>
> The slab tree itself actually boots fine. So Christoph's commit may be
> merely disclosing some bug hidden in another for-next tree which
> happens to be merged before the slab tree.
>
> Attached are some more back traces related to this bug (obviously, not
> only network locks are affected by the bug), as well as the kconfig that
> can relatively easily (but not always) trigger this bug.
>
> Thanks,
> Fengguang
>
> On Wed, Jun 27, 2012 at 08:23:06PM +0800, Fengguang Wu wrote:
> > Hi Christoph,
> >
> > It's a surprise that it bisects down to this commit. I confirmed
> > that it boots reliably if reverting this commit on top of linux-next.
> >
> > 8c138bc00925521c4e764269db3a903bd2a51592 is the first bad commit
> > commit 8c138bc00925521c4e764269db3a903bd2a51592
> > Author: Christoph Lameter
> > Date:   Wed Jun 13 10:24:58 2012 -0500
> >
> >     slab: Get rid of obj_size macro
> >
> >     The size of the slab object is frequently needed. Since we now
> >     have a size field directly in the kmem_cache structure there is no
> >     need anymore of the obj_size macro/function.
> >
> >     Signed-off-by: Christoph Lameter
> >     Signed-off-by: Pekka Enberg

Seems there is a typo in the original patch 8c138bc0:

@@ -3896,9 +3890,9 @@ void kmem_cache_free(struct kmem_cache *cachep, void *objp)
 	unsigned long flags;

 	local_irq_save(flags);
-	debug_check_no_locks_freed(objp, obj_size(cachep));
+	debug_check_no_locks_freed(objp, cachep->size);
	====> this should be cachep->object_size
 	if (!(cachep->flags & SLAB_DEBUG_OBJECTS))
-		debug_check_no_obj_freed(objp, obj_size(cachep));
+		debug_check_no_obj_freed(objp, cachep->object_size);

So the following small patch may fix it:
----------------------

Thanks,
Feng

> >
> > :040000 040000 e0418be654b66b2364add59bb469024fd6958791 f6be0da4d4740844ab8a4c561dbe3815a3f9b8b4 M	mm
> > bisect run success
> >
> > > > [  133.909702] =========================
> > > > [  133.910694] [ BUG: held lock freed! ]
> > > > [  133.911700] 3.5.0-rc4+ #5 Not tainted
> > > > [  133.912672] -------------------------
> > > > [  133.912969] swapper/0/0 is freeing memory ffff88001233ce08-ffff88001233de07, with a lock still held there!
> > > > [  133.912969]  (slock-AF_INET-RPC/1){+.-...}, at: [] tcp_v4_rcv+0x28b/0x6fc
> > > > [  133.912969] 3 locks held by swapper/0/0:
> > > > [  133.912969]  #0:  (rcu_read_lock){.+.+..}, at: [] rcu_lock_acquire+0x0/0x29
> > > > [  133.912969]  #1:  (rcu_read_lock){.+.+..}, at: [] rcu_lock_acquire.constprop.14+0x0/0x30
> > > > [  133.912969]  #2:  (slock-AF_INET-RPC/1){+.-...}, at: [] tcp_v4_rcv+0x28b/0x6fc
> > > > [  133.912969]
> > > > [  133.912969] stack backtrace:
> > > > [  133.912969] Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc4+ #5
> > > > [  133.912969] Call Trace:
> > > > [  133.912969]  [] debug_check_no_locks_freed+0x109/0x14b
> > > > [  133.912969]  [] kmem_cache_free+0x2e/0xa7
> > > > [  133.912969]  [] __kfree_skb+0x7f/0x83
> > > > [  133.912969]  [] tcp_ack+0x45d/0xc6a
> > > > [  133.912969]  [] ? local_clock+0x3b/0x52
> > > > [  133.912969]  [] tcp_rcv_state_process+0x15a/0x7c6
> > > > [  133.912969]  [] tcp_v4_do_rcv+0x341/0x390
> > > > [  133.912969]  [] tcp_v4_rcv+0x678/0x6fc
> > > > [  133.912969]  [] ip_local_deliver_finish+0x165/0x1e4
> > > > [  133.912969]  [] ip_local_deliver+0x53/0x84
> > > > [  133.912969]  [] ? local_clock+0x19/0x52
> > > > [  133.912969]  [] ip_rcv_finish+0x32f/0x367
> > > > [  133.912969]  [] ip_rcv+0x210/0x269
> > > > [  133.912969]  [] ? rcu_lock_acquire+0x27/0x29
> > > > [  133.912969]  [] ? softnet_seq_show+0x68/0x68
> > > > [  133.912969]  [] __netif_receive_skb+0x3cd/0x464
> > > > [  133.912969]  [] netif_receive_skb+0x65/0x9c
> > > > [  133.912969]  [] ? __napi_gro_receive+0xf2/0xff
> > > > [  133.912969]  [] napi_skb_finish+0x26/0x58
> > > > [  133.912969]  [] ? local_clock+0x19/0x52
> > > > [  133.912969]  [] napi_gro_receive+0x2f/0x34
> > > > [  133.912969]  [] e1000_receive_skb+0x57/0x60
> > > > [  133.912969]  [] e1000_clean_rx_irq+0x2f2/0x387
> > > > [  133.912969]  [] e1000_clean+0x541/0x695
> > > > [  133.912969]  [] ? kvm_clock_read+0x2e/0x36
> > > > [  133.912969]  [] ? net_rx_action+0x1b3/0x1f8
> > > > [  133.912969]  [] net_rx_action+0xb3/0x1f8
> > > > [  133.912969]  [] ? __do_softirq+0x76/0x1e8
> > > > [  133.912969]  [] __do_softirq+0xe0/0x1e8
> > > > [  133.912969]  [] ? time_hardirqs_off+0x26/0x2a
> > > > [  133.912969]  [] call_softirq+0x1c/0x30
> > > > [  133.912969]  [] do_softirq+0x4a/0xa2
> > > > [  133.912969]  [] irq_exit+0x51/0xbc
> > > > [  133.912969]  [] do_IRQ+0x8e/0xa5
> > > > [  133.912969]  [] common_interrupt+0x6f/0x6f
> > > > [  133.912969]  [] ? native_safe_halt+0x6/0x8
> > > > [  133.912969]  [] ? trace_hardirqs_on+0xd/0xf
> > > > [  133.912969]  [] default_idle+0x53/0x90
> > > > [  133.912969]  [] cpu_idle+0xcc/0x123
> > > > [  133.912969]  [] rest_init+0xd1/0xda
> > > > [  133.912969]  [] ? csum_partial_copy_generic+0x16c/0x16c
> > > > [  133.912969]  [] start_kernel+0x3da/0x3e7
> > > > [  133.912969]  [] ? repair_env_string+0x5a/0x5a
> > > > [  133.912969]  [] x86_64_start_reservations+0xb1/0xb5
> > > > [  133.912969]  [] x86_64_start_kernel+0xfe/0x10b
> > > > [  134.024230] VFS: Mounted root (nfs filesystem) on device 0:14.
> >
> > Thanks,
> > Fengguang
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/

Tested-by: Fengguang Wu
---
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

diff --git a/mm/slab.c b/mm/slab.c
index 64c3d03..605b3b7 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3890,7 +3890,7 @@ void kmem_cache_free(struct kmem_cache *cachep, void *objp)
 	unsigned long flags;
 
 	local_irq_save(flags);
-	debug_check_no_locks_freed(objp, cachep->size);
+	debug_check_no_locks_freed(objp, cachep->object_size);
 	if (!(cachep->flags & SLAB_DEBUG_OBJECTS))
 		debug_check_no_obj_freed(objp, cachep->object_size);
 	__cache_free(cachep, objp, __builtin_return_address(0));
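
For anyone wondering why a single field name is enough to make lockdep complain, below is a minimal user-space sketch of the range check. It assumes (as I read mm/slab.c) that cachep->size covers the whole slot, i.e. the object plus any debug padding and alignment, while cachep->object_size is only what the cache's user asked for. Note that the range in the report, ffff88001233ce08-ffff88001233de07, is exactly 4096 bytes long. struct demo_cache, held_lock_freed() and any numbers fed to them are made up for illustration; none of this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two kmem_cache fields discussed in this thread. */
struct demo_cache {
	size_t object_size;	/* bytes the cache's user asked for */
	size_t size;		/* assumed: full slot, object plus padding/alignment */
};

/* True if lock_addr lies inside the half-open range [start, start + len). */
static bool addr_in_range(uintptr_t start, size_t len, uintptr_t lock_addr)
{
	assert(len > 0);
	return lock_addr >= start && lock_addr - start < len;
}

/*
 * Simplified model of what debug_check_no_locks_freed(objp, len) looks for:
 * a lock that is still held and lives inside the memory being freed.
 * Only one held lock is considered here.
 */
static bool held_lock_freed(uintptr_t objp, size_t len, uintptr_t held_lock)
{
	return addr_in_range(objp, len, held_lock);
}
```

With a made-up 256-byte object in a 4096-byte slot, a held lock sitting 1024 bytes past objp is reported when len is c.size and is not reported when len is c.object_size, which would explain a report about a lock that belongs to some other, still live, object.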