From patchwork Mon Jun 5 10:02:30 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: George Dunlap
X-Patchwork-Id: 9766021
From: George Dunlap
Date: Mon, 5 Jun 2017 11:02:30 +0100
Message-ID: <1496656950-15815-1-git-send-email-george.dunlap@citrix.com>
Cc: Wei Liu, Ian Jackson, George Dunlap
Subject: [Xen-devel] [PATCH for 4.9] vif-common.sh: Have iptables wait for the xtables lock
List-Id: Xen developer discussion

iptables has a system-wide lock on the xtables.
Strangely though, when two invocations run concurrently, the default
is for the one that fails to grab the lock to exit rather than wait
for it. This means that when starting a large number of guests in
parallel, many will fail with messages like this:

  2017-05-10 11:45:40 UTC libxl: error: libxl_exec.c:118: libxl_report_child_exitstatus: /etc/xen/scripts/vif-bridge remove [18767] exited with error status 4
  2017-05-10 11:50:52 UTC libxl: error: libxl_exec.c:118: libxl_report_child_exitstatus: /etc/xen/scripts/vif-bridge offline [1554] exited with error status 4

In order to instruct iptables to wait for the lock, you have to
specify '-w'. Unfortunately, not all versions of iptables have the
'-w' option, so on first invocation check whether iptables accepts it.

Reported-by: Antony Saba
Signed-off-by: George Dunlap
Acked-by: Ian Jackson
---
CC: Ian Jackson
CC: Wei Liu
---
 tools/hotplug/Linux/vif-common.sh | 38 +++++++++++++++++++++++++++++++++++---
 1 file changed, 35 insertions(+), 3 deletions(-)

diff --git a/tools/hotplug/Linux/vif-common.sh b/tools/hotplug/Linux/vif-common.sh
index 6e8d584..29cd8dd 100644
--- a/tools/hotplug/Linux/vif-common.sh
+++ b/tools/hotplug/Linux/vif-common.sh
@@ -120,6 +120,38 @@ fi
 ip=${ip:-}
 ip=$(xenstore_read_default "$XENBUS_PATH/ip" "$ip")
 
+IPTABLES_WAIT_RUNE="-w"
+IPTABLES_WAIT_RUNE_CHECKED=false
+
+# When iptables introduced locking, in the event of lock contention,
+# they made "fail" rather than "wait for the lock" the default
+# behavior. In order to select "wait for the lock" behavior, you have
+# to add the '-w' parameter. Unfortunately, both the locking and the
+# option were only introduced in 2013, and older versions of iptables
+# will fail if the '-w' parameter is included (since they don't
+# recognize it). So check to see if it's supported the first time we
+# use it.
+iptables_w()
+{
+    if ! $IPTABLES_WAIT_RUNE_CHECKED ; then
+        iptables $IPTABLES_WAIT_RUNE -L -n >& /dev/null ; local rc=$?
+        if [[ $rc == 0 ]] ; then
+            # If we succeed, then -w is supported; don't check again
+            IPTABLES_WAIT_RUNE_CHECKED=true
+        elif [[ $rc == 2 ]] ; then
+            iptables -L -n >& /dev/null
+            if [[ $? != 2 ]] ; then
+                # If we fail with PARAMETER_PROBLEM (2) with -w and
+                # don't fail with PARAMETER_PROBLEM without it, then
+                # it's the -w option
+                IPTABLES_WAIT_RUNE_CHECKED=true
+                IPTABLES_WAIT_RUNE=""
+            fi
+        fi
+    fi
+    iptables $IPTABLES_WAIT_RUNE "$@"
+}
+
 frob_iptable()
 {
   if [ "$command" == "online" -o "$command" == "add" ]
@@ -129,9 +161,9 @@ frob_iptable()
     local c="-D"
   fi
 
-  iptables "$c" FORWARD -m physdev --physdev-is-bridged --physdev-in "$dev" \
+  iptables_w "$c" FORWARD -m physdev --physdev-is-bridged --physdev-in "$dev" \
     "$@" -j ACCEPT 2>/dev/null &&
-  iptables "$c" FORWARD -m physdev --physdev-is-bridged --physdev-out "$dev" \
+  iptables_w "$c" FORWARD -m physdev --physdev-is-bridged --physdev-out "$dev" \
     -j ACCEPT 2>/dev/null
 
   if [ \( "$command" == "online" -o "$command" == "add" \) -a $? -ne 0 ]
@@ -154,7 +186,7 @@ handle_iptable()
   # binary is not sufficient, because the user may not have the appropriate
   # modules installed.  If iptables is not working, then there's no need to do
   # anything with it, so we can just return.
-  if ! iptables -L -n >&/dev/null
+  if ! iptables_w -L -n >&/dev/null
   then
     return
   fi
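
For anyone who wants to try the probe logic without a real iptables, here is a minimal sketch of the same idea. It makes some assumptions: old_iptables, new_iptables and probe_wait_rune are hypothetical names that are not part of vif-common.sh, and the two stubs only imitate the exit codes described in the commit message (2, PARAMETER_PROBLEM, for an unrecognised '-w'). The exit status is stored in a variable once, because each later test would otherwise overwrite $?.

```shell
#!/bin/bash
# Hypothetical stand-in for an old iptables that predates '-w':
# it returns 2 (PARAMETER_PROBLEM) whenever it sees the option.
old_iptables() {
    local arg
    for arg in "$@"; do
        if [ "$arg" = "-w" ]; then
            return 2
        fi
    done
    return 0
}

# Hypothetical stand-in for a newer iptables that accepts '-w'.
new_iptables() {
    return 0
}

# Print the wait option to pass to the command named in $1:
# "-w" normally, or nothing if the command rejects '-w' as a bad parameter.
probe_wait_rune() {
    local cmd="$1" rc=0 rc2=0
    # Save the status once; later tests would clobber $?.
    "$cmd" -w -L -n >/dev/null 2>&1 || rc=$?
    if [ "$rc" -eq 2 ]; then
        "$cmd" -L -n >/dev/null 2>&1 || rc2=$?
        if [ "$rc2" -ne 2 ]; then
            # Status 2 only appears with '-w', so the option is unsupported.
            return 0
        fi
    fi
    echo "-w"
}

echo "old: '$(probe_wait_rune old_iptables)'"
echo "new: '$(probe_wait_rune new_iptables)'"
```

With these stubs the old variant yields an empty string and the new one yields -w. A real script would cache the result, as the patch does with IPTABLES_WAIT_RUNE_CHECKED, rather than probing on every call.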