From patchwork Wed Jan 21 19:48:07 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Trond Myklebust
X-Patchwork-Id: 5680221
Message-ID: <1421869687.4674.2.camel@primarydata.com>
Subject: Re: Yet another kernel crash in NFS4 state recovery
From: Trond Myklebust
To: Olga Kornievskaia
Cc: "Mkrtchyan, Tigran", Linux NFS Mailing List
Date: Wed, 21 Jan 2015 14:48:07 -0500
References: <130621862.279655.1421851650684.JavaMail.zimbra@desy.de>
Organization: Primary Data, Inc
X-Mailer: Evolution 3.12.9 (3.12.9-1.fc21)
X-Mailing-List: linux-nfs@vger.kernel.org

On Wed, 2015-01-21 at 14:09 -0500, Olga Kornievskaia wrote:
> On Wed, Jan 21, 2015 at 1:41 PM, Trond Myklebust
> wrote:
> > On Wed, Jan 21, 2015 at 9:47 AM, Mkrtchyan, Tigran
> > wrote:
> >>
> >>
> >> Now with RHEL7.
> >>
> >> [ 482.016897] BUG: unable to handle kernel NULL pointer dereference at 000000000000001a
> >> [ 482.017023] IP: [] rpc_peeraddr2str+0x5/0x30 [sunrpc]
> >> [ 482.017023] PGD baefe067 PUD baeff067 PMD 0
> >> [ 482.017023] Oops: 0000 [#1] SMP
> >> [ 482.017023] Modules linked in: nfs_layout_nfsv41_files rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter ip_tables sg ppdev kvm_intel kvm pcspkr serio_raw virtio_balloon i2c_piix4 parport_pc parport mperf nfsd auth_rpcgss nfs_acl lockd sunrpc sr_mod cdrom ata_generic pata_acpi ext4 mbcache jbd2 virtio_blk cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ttm virtio_net ata_piix drm libata virtio_pci virtio_ring virtio
> >> [ 482.017023] i2c_core floppy
> >> [ 482.017023] CPU: 0 PID: 2834 Comm: xrootd Not tainted 3.10.0-123.13.2.el7.x86_64 #1
> >> [ 482.017023] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> >> [ 482.017023] task: ffff8800b188cfa0 ti: ffff880232484000 task.ti: ffff880232484000
> >> [ 482.017023] RIP: 0010:[] [] rpc_peeraddr2str+0x5/0x30 [sunrpc]
> >> [ 482.017023] RSP: 0018:ffff880232485708 EFLAGS: 00010246
> >> [ 482.017023] RAX: 000000000001bcb0 RBX: ffff880233ded800 RCX: 0000000000000000
> >> [ 482.017023] RDX: ffffffffa0494078 RSI: 0000000000000000 RDI: ffffffffffffffea
> >> [ 482.017023] RBP: ffff880232485760 R08: ffff880232485740 R09: 0000000000000000
> >> [ 482.017023] R10: 0000000000000000 R11: fffffffffffffff2 R12: ffff8800bac3e690
> >> [ 482.017023] R13: ffff8800bac3e638 R14: 0000000000000000 R15: 0000000000000000
> >> [ 482.017023] FS: 00007f0d84b79700(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
> >> [ 482.017023] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >> [ 482.017023] CR2: 000000000000001a CR3: 00000000baefd000 CR4: 00000000000006f0
> >> [ 482.017023] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> [ 482.017023] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >> [ 482.017023] Stack:
> >> [ 482.017023] ffffffffa04c79a5 0000000000000000 ffff880232485768 ffffffffa046d858
> >> [ 482.017023] 0000000000000000 ffff8800b188cfa0 ffffffff81086ac0 ffff880232485740
> >> [ 482.017023] ffff880232485740 0000000096605de3 ffff880233ded800 ffff880232485778
> >> [ 482.017023] Call Trace:
> >> [ 482.017023] [] ? nfs4_schedule_state_manager+0x65/0xf0 [nfsv4]
> >> [ 482.017023] [] ? nfs_wait_client_init_complete.part.6+0x98/0xd0 [nfs]
> >> [ 482.017023] [] ? wake_up_bit+0x30/0x30
> >> [ 482.017023] [] nfs4_schedule_lease_recovery+0x2e/0x60 [nfsv4]
> >> [ 482.017023] [] nfs41_walk_client_list+0x104/0x340 [nfsv4]
> >> [ 482.017023] [] nfs41_discover_server_trunking+0x39/0x40 [nfsv4]
> >> [ 482.017023] [] nfs4_discover_server_trunking+0x7d/0x2e0 [nfsv4]
> >> [ 482.017023] [] nfs4_init_client+0x124/0x2f0 [nfsv4]
> >> [ 482.017023] [] ? __fscache_acquire_cookie+0x74/0x2a0 [fscache]
> >> [ 482.017023] [] ? __fscache_acquire_cookie+0x74/0x2a0 [fscache]
> >> [ 482.017023] [] ? generic_lookup_cred+0x15/0x20 [sunrpc]
> >> [ 482.017023] [] ? __rpc_init_priority_wait_queue+0x81/0xc0 [sunrpc]
> >> [ 482.017023] [] ? rpc_init_wait_queue+0x13/0x20 [sunrpc]
> >> [ 482.017023] [] ? nfs4_alloc_client+0x189/0x1e0 [nfsv4]
> >> [ 482.017023] [] nfs_get_client+0x26a/0x320 [nfs]
> >> [ 482.017023] [] nfs4_set_ds_client+0x8e/0xe0 [nfsv4]
> >> [ 482.017023] [] nfs4_fl_prepare_ds+0xe9/0x298 [nfs_layout_nfsv41_files]
> >> [ 482.017023] [] filelayout_read_pagelist+0x56/0x170 [nfs_layout_nfsv41_files]
> >> [ 482.017023] [] pnfs_generic_pg_readpages+0xe7/0x270 [nfsv4]
> >> [ 482.017023] [] nfs_pageio_doio+0x19/0x50 [nfs]
> >> [ 482.017023] [] nfs_pageio_complete+0x24/0x30 [nfs]
> >> [ 482.017023] [] nfs_readpages+0x16a/0x1d0 [nfs]
> >> [ 482.017023] [] ? __page_cache_alloc+0x87/0xb0
> >> [ 482.017023] [] __do_page_cache_readahead+0x1cc/0x250
> >> [ 482.017023] [] ondemand_readahead+0x126/0x240
> >> [ 482.017023] [] page_cache_sync_readahead+0x31/0x50
> >> [ 482.017023] [] generic_file_aio_read+0x1ab/0x750
> >> [ 482.017023] [] nfs_file_read+0x71/0xf0 [nfs]
> >> [ 482.017023] [] do_sync_read+0x8d/0xd0
> >> [ 482.017023] [] vfs_read+0x9c/0x170
> >> [ 482.017023] [] SyS_pread64+0x92/0xc0
> >> [ 482.017023] [] system_call_fastpath+0x16/0x1b
> >> [ 482.017023] Code: c3 0f 1f 44 00 00 0f 1f 44 00 00 55 48 c7 47 50 40 72 1d a0 48 89 e5 5d c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 <48> 8b 47 30 89 f6 55 48 c7 c2 d8 da 1f a0 48 89 e5 48 8b 84 f0
> >> [ 482.017023] RIP [] rpc_peeraddr2str+0x5/0x30 [sunrpc]
> >> [ 482.017023] RSP
> >> [ 482.017023] CR2: 000000000000001a
> >>
> >>
> >> Looks like clp->cl_rpcclient point to nowhere when nfs4_schedule_state_manager is called.
> >>
> >
> > I'm guessing
> >
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=080af20cc945d110f9912d01cf6b66f94a375b8d
> >
>
> The Oops is seen even with that patch. As I was explained, in the
> commit you pointed at the whole client structure is null. In this case
> it's the rpcclient structure that's invalid.

Ah. You are right...

Tigran, how about the following patch?

Cheers
  Trond

8<---------------------------------------------------------------------
From eb8720a31e1d36415c7377f287d5d217540830c3 Mon Sep 17 00:00:00 2001
From: Trond Myklebust
Date: Wed, 21 Jan 2015 14:37:44 -0500
Subject: [PATCH] NFSv4.1: Fix an Oops in nfs41_walk_client_list

If we start state recovery on a client that failed to initialise correctly,
then we are very likely to Oops.

Reported-by: "Mkrtchyan, Tigran"
Link: http://lkml.kernel.org/r/130621862.279655.1421851650684.JavaMail.zimbra@desy.de
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust
---
 fs/nfs/nfs4client.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
index 953daa44a282..706ad10b8186 100644
--- a/fs/nfs/nfs4client.c
+++ b/fs/nfs/nfs4client.c
@@ -639,7 +639,7 @@ int nfs41_walk_client_list(struct nfs_client *new,
 			prev = pos;
 
 			status = nfs_wait_client_init_complete(pos);
-			if (status == 0) {
+			if (pos->cl_cons_state == NFS_CS_SESSION_INITING) {
 				nfs4_schedule_lease_recovery(pos);
 				status = nfs4_wait_clnt_recover(pos);
 			}
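
A note on reading the quoted Oops, as a rough userspace sketch rather than kernel code. The faulting bytes <48> 8b 47 30 decode to "mov 0x30(%rdi),%rax", RDI is ffffffffffffffea (that is -22, which is -EINVAL on Linux), and -22 + 0x30 gives exactly the reported CR2 of 0x1a. That would be consistent with the rpc client pointer holding an error-encoded value instead of a real structure, which fits the "failed to initialise correctly" wording of the commit message; this reading is an inference from the register dump, not something stated in the thread. All sketch_* names below are hypothetical stand-ins (they only imitate the idea behind the kernel's ERR_PTR()/IS_ERR() helpers), and the code assumes a 64-bit build.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in: the top 4095 addresses are treated as
 * encoded negative errno values, similar in spirit to the kernel. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True if the pointer value falls in the error-encoding range. */
static inline bool sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Address touched by "mov 0x30(%rdi),%rax" for a given RDI value.
 * Unsigned arithmetic wraps modulo 2^64, as the CPU's would. */
static inline uint64_t sketch_fault_address(uint64_t rdi)
{
	return rdi + 0x30;
}

/* Check the values taken from the quoted register dump. */
static inline void sketch_check_trace_values(void)
{
	const uint64_t rdi = 0xffffffffffffffeaULL; /* RDI from the Oops */

	assert((int64_t)rdi == -22);                /* -EINVAL on Linux */
	assert(sketch_fault_address(rdi) == 0x1a);  /* CR2 from the Oops */
	assert(sketch_is_err((void *)(uintptr_t)rdi));
	assert(!sketch_is_err((void *)0));          /* NULL is not an error code */
}
```

Calling sketch_check_trace_values() from a small test program should pass silently on x86_64. Seen this way, testing only that the wait returned 0 says nothing about whether the client is usable, whereas the patch requires the client to be in the NFS_CS_SESSION_INITING state before lease recovery is scheduled.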