From patchwork Tue May 23 05:29:39 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: linzhang
X-Patchwork-Id: 9741929
From: Lin Zhang
To: aar@pengutronix.de, stefan@osg.samsung.com, davem@davemloft.net
Cc: linux-wpan@vger.kernel.org, netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org, Lin Zhang
Subject: [PATCH net v2 2/2] net: ieee802154: fix net_device reference release too early
Date: Tue, 23 May 2017 13:29:39 +0800
Message-Id: <1495517379-18456-1-git-send-email-xiaolou4617@gmail.com>
X-Mailer: git-send-email 1.8.3.1
Sender: linux-wpan-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-wpan@vger.kernel.org

This patch fixes a kernel oops caused by releasing the net_device
reference too early.
In raw_sendmsg() (I think dgram_sendmsg() has the same problem), there is
a race between dev_put() and dev_queue_xmit(): if the device goes away in
between, dev_queue_xmit() may see an invalid net_device pointer.

My test kernel is 3.13.0-32. Because I do not have a real 802154 device,
I changed lowpan_newlink() to this:

	/* find and hold real wpan device */
	real_dev = dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK]));
	if (!real_dev)
		return -ENODEV;
//	if (real_dev->type != ARPHRD_IEEE802154) {
//		dev_put(real_dev);
//		return -EINVAL;
//	}
	lowpan_dev_info(dev)->real_dev = real_dev;
	lowpan_dev_info(dev)->fragment_tag = 0;
	mutex_init(&lowpan_dev_info(dev)->dev_list_mtx);

Also, in order to simulate preemption, I changed raw_sendmsg() to this:

	skb->dev = dev;
	skb->sk = sk;
	skb->protocol = htons(ETH_P_IEEE802154);
	dev_put(dev);
	//simulate preempt
	schedule_timeout_uninterruptible(30 * HZ);
	err = dev_queue_xmit(skb);
	if (err > 0)
		err = net_xmit_errno(err);

This is my userspace test code, named test_send_data:

	#include <errno.h>
	#include <stdio.h>
	#include <string.h>
	#include <sys/socket.h>

	int main(int argc, char **argv)
	{
		char buf[127] = { 0 };	/* zeroed payload; contents do not matter */
		int sockfd;

		sockfd = socket(AF_IEEE802154, SOCK_RAW, 0);
		if (sockfd < 0) {
			printf("create sockfd error: %s\n", strerror(errno));
			return -1;
		}
		send(sockfd, buf, sizeof(buf), 0);
		return 0;
	}

This is my test case:

root@zhanglin-x-computer:~/develop/802154# uname -a
Linux zhanglin-x-computer 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
root@zhanglin-x-computer:~/develop/802154# ip link add link eth0 name lowpan0 type lowpan
root@zhanglin-x-computer:~/develop/802154# //keep the lowpan0 device down
root@zhanglin-x-computer:~/develop/802154# ./test_send_data &
//wait a while
root@zhanglin-x-computer:~/develop/802154# ip link del link dev lowpan0
//the device is gone
//oops
[381.303307] general protection fault: 0000 [#1]SMP
[381.303407] Modules linked in: af_802154 6lowpan bnep rfcomm bluetooth nls_iso8859_1 snd_hda_codec_hdmi snd_hda_codec_realtek rts5139(C) snd_hda_intel snd_had_codec snd_hwdep snd_pcm snd_page_alloc snd_seq_midi snd_seq_midi_event snd_rawmidi snd_req intel_rapl snd_seq_device coretemp i915 kvm_intel kvm snd_timer snd crct10dif_pclmul crc32_pclmul ghash_clmulni_intel cypted drm_kms_helper drm i2c_algo_bit soundcore video mac_hid parport_pc ppdev ip parport hid_generic usbhid hid ahci r8169 mii libahdi
[381.304286] CPU:1 PID: 2524 Commm: 1 Tainted: G C 0 3.13.0-32-generic
[381.304409] Hardware name: Haier Haier DT Computer/Haier DT Codputer, BIOS FIBT19H02_X64 06/09/2014
[381.304546] tasks: ffff000096965fc0 ti: ffffB0013779c000 task.ti: ffffB8013779c000
[381.304659] RIP: 0010:[] [] __dev_queue_ximt+0x61/0x500
[381.304798] RSP: 0018:ffffB8013779dca0 EFLAGS: 00010202
[381.304880] RAX: 272b031d57565351 RBX: 0000000000000000 RCX: ffff8800968f1a00
[381.304987] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800968f1a00
[381.305095] RBP: ffff8e013773dce0 R08: 0000000000000266 R09: 0000000000000004
[381.305202] R10: 0000000000000004 R11: 0000000000000005 R12: ffff88013902e000
[381.305310] R13: 000000000000007f R14: 000000000000007f R15: ffff8800968f1a00
[381.305418] FS: 00007fc57f50f740(0000) GS: ffff88013fc80000(0000) knlGS: 0000000000000000
[381.305540] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[381.305627] CR2: 00007fad0841c000 CR3: 00000001368dd000 CR4: 00000000001007e0
[361.905734] Stack:
[381.305768] 00000000002052d0 000000003facb30a ffff88013779dcc0 ffff880137764000
[381.305898] ffff88013779de70 000000000000007f 000000000000007f ffff88013902e000
[381.306026] ffff88013779dcf0 ffffffff81622490 ffff88013779dd39 ffffffffa03af9f1
[381.306155] Call Trace:
[381.306202] [] dev_queue_xmit+0x10/0x20
[381.306294] [] raw_sendmsg+0x1b1/0x270 [af_802154]
[381.306396] [] ieee802154_sock_sendmsg+0x14/0x20 [af_802154]
[381.306512] [] sock_sendmsg+0x8b/0xc0
[381.306600] [] ? __d_alloc+0x25/0x180
[381.306687] [] ? kmem_cache_alloc_trace+0x1c6/0x1f0
[381.306791] [] SYSC_sendto+0x121/0x1c0
[381.306878] [] ? vtime_account_user+x54/0x60
[381.306975] [] ? syscall_trace_enter+0x145/0x250
[381.307073] [] SyS_sendto+0xe/0x10
[381.307156] [] tracesys+0xe1/0xe6
[381.307233] Code: c6 a1 a4 ff 41 8b 57 78 49 8b 47 20 85 d2 48 8b 80 78 07 00 00 75 21 49 8b 57 18 48 85 d2 74 18 48 85 c0 74 13 8b 92 ac 01 00 00 <3b> 50 10 73 08 8b 44 90 14 41 89 47 78 41 f6 84 24 d5 00 00 00
[381.307801] RIP [] _dev_queue_xmit+0x61/0x500
[381.307901] RSP
[381.347512] Kernel panic - not syncing: Fatal exception in interrupt
[381.347747] drm_kms_helper: panic occurred, switching back to text console

In my opinion, there is always a chance that the device is gone before
dev_queue_xmit() is called. I think the latest kernel has the same
problem, and that dev_put() should be called after dev_queue_xmit().

Signed-off-by: Lin Zhang
Acked-by: Stefan Schmidt
---
changelog:

v1 -> v2:
* split v1 into two patches, per Stefan Schmidt.

Hello, Stefan:
If you have a real 802154 device, you could run the test case above.
Thanks.

Thanks to Stefan Schmidt for reviewing!
---
 net/ieee802154/socket.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c
index b01a1f0..a60658c 100644
--- a/net/ieee802154/socket.c
+++ b/net/ieee802154/socket.c
@@ -303,12 +303,12 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	skb->dev = dev;
 	skb->protocol = htons(ETH_P_IEEE802154);
 
-	dev_put(dev);
-
 	err = dev_queue_xmit(skb);
 	if (err > 0)
 		err = net_xmit_errno(err);
 
+	dev_put(dev);
+
 	return err ?: size;
 
 out_skb:
@@ -691,12 +691,12 @@ static int dgram_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	skb->dev = dev;
 	skb->protocol = htons(ETH_P_IEEE802154);
 
-	dev_put(dev);
-
 	err = dev_queue_xmit(skb);
 	if (err > 0)
 		err = net_xmit_errno(err);
 
+	dev_put(dev);
+
 	return err ?: size;
 
 out_skb:
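
Note: the ordering problem can also be illustrated with a small
user-space model. This is only a sketch and not kernel code. The names
fake_dev, fake_dev_hold(), fake_dev_put(), fake_unregister() and
fake_xmit() are hypothetical stand-ins, and the model assumes the device
is torn down as soon as its last reference is dropped. The concurrent
"ip link del" is simulated by calling fake_unregister() at the point
where the sender could be preempted.

```c
#include <assert.h>

/* Hypothetical stand-in for struct net_device: a refcount and a liveness flag. */
struct fake_dev {
	int refcnt;	/* starts at 1: the registration reference */
	int alive;	/* cleared when the last reference is dropped */
};

static void fake_dev_hold(struct fake_dev *d)
{
	d->refcnt++;
}

static void fake_dev_put(struct fake_dev *d)
{
	assert(d->refcnt > 0);
	if (--d->refcnt == 0)
		d->alive = 0;	/* models the device being torn down */
}

/* Models device removal: drops the registration reference. */
static void fake_unregister(struct fake_dev *d)
{
	fake_dev_put(d);
}

/* Models dev_queue_xmit(): 0 on success, -1 if the device is already gone. */
static int fake_xmit(const struct fake_dev *d)
{
	return d->alive ? 0 : -1;
}

/* Old ordering: reference dropped before transmit. */
static int send_put_first(struct fake_dev *d)
{
	int err;

	fake_dev_hold(d);
	fake_dev_put(d);
	fake_unregister(d);	/* removal wins the race in this window */
	err = fake_xmit(d);
	return err;
}

/* New ordering: reference held across transmit. */
static int send_put_last(struct fake_dev *d)
{
	int err;

	fake_dev_hold(d);
	fake_unregister(d);	/* same window, but our reference keeps it alive */
	err = fake_xmit(d);
	fake_dev_put(d);
	return err;
}
```

With a device initialised as { .refcnt = 1, .alive = 1 }, the old
ordering transmits on a dead device in this model, while the new
ordering transmits successfully and the device is released only
afterwards.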