diff mbox

I_CLEAR bug while shrinking dcache

Message ID 20150414081254.GD14691@type.youpi.perso.aquilenet.fr (mailing list archive)
State New, archived
Headers show

Commit Message

Samuel Thibault April 14, 2015, 8:12 a.m. UTC
Hello,

This morning I got the following with linux 4.0.0:

Apr 14 08:36:28 type kernel: [46936.950171] kernel BUG at fs/inode.c:1475!
Apr 14 08:36:28 type kernel: [46936.950206] invalid opcode: 0000 [#1] SMP 
Apr 14 08:36:28 type kernel: [46936.950246] Modules linked in: tun xt_REDIRECT nf_nat_redirect xt_tcpudp ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables binfmt_misc nfsd cpufreq_powersave auth_rpcgss cpufreq_stats oid_registry nfs_acl cpufreq_userspace lockd cpufreq_conservative bbswitch(O) grace sunrpc joydev hid_generic usbhid hid cdc_mbim cdc_ncm usbnet mii cdc_acm cdc_wdm arc4 brcmsmac cordic brcmutil b43 nls_utf8 nls_cp437 mac80211 vfat fat cfg80211 x86_pkg_temp_thermal i915 kvm_intel kvm snd_hda_codec_hdmi ssb rng_core pcmcia pcmcia_core crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 dell_wmi lrw sparse_keymap gf128mul drm_kms_helper dell_laptop glue_helper rfkill ablk_helper cryptd drm psmouse dcdbas evdev serio_raw bcma i2c_i801 acpi_cpufreq tpm_tis wmi i2c_algo_bit tpm i2c_core 8250_fintek video ac battery processor button snd_hda_codec_idt snd_hda_codec_generic ledtrig_timer ledtrig_oneshot ledtrig_heartbeat ledtrig_default_on snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore coretemp ohci_hcd ledtrig_backlight pcspkr i8k firewire_sbp2 firewire_core crc_itu_t loop fuse parport_pc ppdev lp parport autofs4 microcode crc32c_intel ehci_pci ehci_hcd sdhci_pci sr_mod sdhci cdrom sg mmc_core usbcore e1000e ptp usb_common pps_core thermal thermal_sys
Apr 14 08:36:28 type kernel: [46936.951610] CPU: 3 PID: 63 Comm: kswapd0 Tainted: G        W  O    4.0.0 #87
Apr 14 08:36:28 type kernel: [46936.951666] Hardware name: Dell Inc. Latitude E6420/      , BIOS A14 07/11/2012
Apr 14 08:36:28 type kernel: [46936.951723] task: ffff8800b64a68d0 ti: ffff8800b1310000 task.ti: ffff8800b1310000
Apr 14 08:36:28 type kernel: [46936.951781] RIP: 0010:[<ffffffff8113eb4b>]  [<ffffffff8113eb4b>] iput+0x23/0x194
Apr 14 08:36:28 type kernel: [46936.951845] RSP: 0018:ffff8800b1313bd8  EFLAGS: 00010202
Apr 14 08:36:28 type kernel: [46936.951887] RAX: 0000000000000000 RBX: ffff880098c92c68 RCX: ffff8800928883e0
Apr 14 08:36:28 type kernel: [46936.951942] RDX: ffff880098c92d88 RSI: 000000000000f724 RDI: ffff880098c92c68
Apr 14 08:36:28 type kernel: [46936.951997] RBP: ffff880098c92cf0 R08: 0000000000000000 R09: ffff8800b2bddd58
Apr 14 08:36:28 type kernel: [46936.952053] R10: 000000000000b53e R11: ffffffff81a04b80 R12: ffff8800b2bdfc70
Apr 14 08:36:28 type kernel: [46936.952108] R13: ffff8800b1313c68 R14: 0000000000000000 R15: 0000000000000000
Apr 14 08:36:28 type kernel: [46936.952163] FS:  0000000000000000(0000) GS:ffff88013dc60000(0000) knlGS:0000000000000000
Apr 14 08:36:28 type kernel: [46936.952226] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 14 08:36:28 type kernel: [46936.952270] CR2: 00007f4cc1f43920 CR3: 000000000180d000 CR4: 00000000000406e0
Apr 14 08:36:28 type kernel: [46936.952325] Stack:
Apr 14 08:36:28 type kernel: [46936.952342]  ffff8800928883e0 ffff8800b2bdfc18 ffff880098c92c68 ffff8800b2bdfc70
Apr 14 08:36:28 type kernel: [46936.952410]  ffff8800b1313c68 ffffffff8113b2c1 ffff8800b2bdfc98 ffff880129310c18
Apr 14 08:36:28 type kernel: [46936.952477]  ffff8800b2bdfc18 ffffffff8113b6d5 0000000000000000 ffff8800b1313d38
Apr 14 08:36:28 type kernel: [46936.954381] Call Trace:
Apr 14 08:36:28 type kernel: [46936.956290]  [<ffffffff8113b2c1>] ? __dentry_kill+0xfa/0x15c
Apr 14 08:36:28 type kernel: [46936.958219]  [<ffffffff8113b6d5>] ? shrink_dentry_list+0xfc/0x18a
Apr 14 08:36:28 type kernel: [46936.960159]  [<ffffffff8113c26f>] ? prune_dcache_sb+0x49/0x53
Apr 14 08:36:28 type kernel: [46936.962061]  [<ffffffff8112d893>] ? super_cache_scan+0xf2/0x16c
Apr 14 08:36:28 type kernel: [46936.963989]  [<ffffffff810ef6ad>] ? shrink_slab.part.53.constprop.71+0x1a6/0x2ac
Apr 14 08:36:28 type kernel: [46936.965882]  [<ffffffff810f1a38>] ? shrink_zone+0x6b/0x131
Apr 14 08:36:28 type kernel: [46936.967746]  [<ffffffff810f26ca>] ? kswapd+0x57a/0x740
Apr 14 08:36:28 type kernel: [46936.969549]  [<ffffffff810f2150>] ? zone_reclaim+0x1cb/0x1cb
Apr 14 08:36:28 type kernel: [46936.971322]  [<ffffffff81060503>] ? kthread+0xab/0xb3
Apr 14 08:36:28 type kernel: [46936.973026]  [<ffffffff81060458>] ? __kthread_parkme+0x5d/0x5d
Apr 14 08:36:28 type kernel: [46936.974694]  [<ffffffff81448d58>] ? ret_from_fork+0x58/0x90
Apr 14 08:36:28 type kernel: [46936.976362]  [<ffffffff81060458>] ? __kthread_parkme+0x5d/0x5d
Apr 14 08:36:28 type kernel: [46936.977944] Code: ff 05 f2 1c ed 7e 58 c3 48 85 ff 0f 84 8a 01 00 00 41 55 41 54 55 53 48 8d af 88 00 00 00 51 f6 87 a0 00 00 00 40 48 89 fb 74 47 <0f> 0b 83 7b 48 00 48 8b 83 a0 00 00 00 74 53 f6 c4 08 74 4e 4c 
Apr 14 08:36:28 type kernel: [46936.982994] RIP  [<ffffffff8113eb4b>] iput+0x23/0x194
Apr 14 08:36:28 type kernel: [46936.984566]  RSP <ffff8800b1313bd8>
Apr 14 08:36:28 type kernel: [46936.991466] ---[ end trace de34697426b2c75a ]---

So it seems that shrink_dentry_list was trying to kill an inode which
already has I_CLEAR.  I'll be trying the attached patch in the next
days.

Samuel

Comments

Al Viro April 14, 2015, 12:44 p.m. UTC | #1
On Tue, Apr 14, 2015 at 10:12:54AM +0200, Samuel Thibault wrote:

> So it seems that shrink_dentry_list was trying to kill an inode which
> already has I_CLEAR.  I'll be trying the attached patch in the next
> days.

That patch is just papering over the problem.  What you have is a dentry
with ->d_inode pointing to something with I_CLEAR set.

So either
	* dentry has already been through __dentry_kill()
	* or you have live dentry with ->d_inode somehow not contributing
to refcount of ->d_inode
	* or you have an inode with positive refcount and I_CLEAR set.

The first variant can happen, but in that case it should have had
DCACHE_DENTRY_KILLED set by said __dentry_kill() and skipped.  The other
two are clear inode refcounting bugs.

Do you have a reproducer for that?
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Samuel Thibault April 14, 2015, 2:44 p.m. UTC | #2
Al Viro, le Tue 14 Apr 2015 13:44:29 +0100, a écrit :
> On Tue, Apr 14, 2015 at 10:12:54AM +0200, Samuel Thibault wrote:
> > So it seems that shrink_dentry_list was trying to kill an inode which
> > already has I_CLEAR.  I'll be trying the attached patch in the next
> > days.
> 
> That patch is just papering over the problem.

I never said it was a proper fix :)

> The first variant can happen, but in that case it should have had
> DCACHE_DENTRY_KILLED set by said __dentry_kill() and skipped.  The other
> two are clear inode refcounting bugs.
> 
> Do you have a reproducer for that?

It just happened in the middle of the night.

Samuel
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Samuel Thibault April 20, 2015, 10:58 p.m. UTC | #3
Samuel Thibault, le Tue 14 Apr 2015 16:44:02 +0200, a écrit :
> Al Viro, le Tue 14 Apr 2015 13:44:29 +0100, a écrit :
> > The first variant can happen, but in that case it should have had
> > DCACHE_DENTRY_KILLED set by said __dentry_kill() and skipped.  The other
> > two are clear inode refcounting bugs.
> > 
> > Do you have a reproducer for that?
> 
> It just happened in the middle of the night.

I haven't seen it happening again since then.

Samuel
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -949,6 +949,14 @@  static void shrink_dentry_list(struct li
 				spin_unlock(&parent->d_lock);
 			continue;
 		}
+		if (inode && (inode->i_state & I_CLEAR)) {
+			spin_unlock(&inode->i_lock);
+			d_shrink_add(dentry, list);
+			spin_unlock(&dentry->d_lock);
+			if (parent)
+				spin_unlock(&parent->d_lock);
+			continue;
+		}
 
 		__dentry_kill(dentry);