From patchwork Tue Jan 24 03:24:29 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Windsor X-Patchwork-Id: 9539433 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id EDA8E60429 for ; Thu, 26 Jan 2017 15:16:37 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E29B41FF60 for ; Thu, 26 Jan 2017 15:16:37 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D695A26223; Thu, 26 Jan 2017 15:16:37 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=2.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, FREEMAIL_FROM, RCVD_IN_DNSWL_MED, T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 0426A22376 for ; Thu, 26 Jan 2017 15:16:36 +0000 (UTC) Received: (qmail 25702 invoked by uid 550); 26 Jan 2017 15:16:34 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 25673 invoked from network); 26 Jan 2017 15:16:33 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=8F9jr5FJB3a3e9o/6jAD8YR9ym42JnyrdKAYSuNcYiw=; b=EJxdE51hoDWVni/SQvJR9p4izPwDQiuA3CIhuRXJSgI8rv7/UjdN91fH6XbvG0pHDy haKX6iCqUs9U8PsxSKwzrMAKTv+FBM5P458Ur/GEWF/tKq6/KUDRKLzvT5IZAeM6D7QX XH9HaZqsFwDkTE/NAM1t1dfH9cLI4vdFeXXsugBdkgEKl8pIhYfn9MxTcZX+eBxkRSag gMVKnFbY3hv5vmfPGbR8HnS2PsT293oyibcTXVhOGiH/7LtDZf4GGB6uz5+0j5hGEz2Q DTHsZZz68W2yQHsV33d7QxVc6eJ2KTNlTHkAjc4qQWFXmlXLdFIx3UUyezydqJ7TyS3D Et0A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=8F9jr5FJB3a3e9o/6jAD8YR9ym42JnyrdKAYSuNcYiw=; b=gBUB6dpXWOSMPidp8ehw4VNLpetW2Wn8yTPZrdUXF1C5jUhRke9jp3RR+PfZRtENmA p6TvDSUS6RWYwCh/2EAlLS4SVu1zVSr7s/Wx8W7PswR3gwC1JZ6AQqQF0umtxdpldh7Q hYV/qHFg3MocV+jlADyW2gmTNreYy66OmQGEryJJWJwqT7rXj5cWaSTxYoLl3mxdN5BS 8SQszBU3eWsE8q8EuzBZ4oeo3Gv6M5I0RcIrCRCRA5ckyjRO9yu+wyB+KMIcrL5wOZCe /3S27jMwInhi3TKF35WGKtHVowgiWX9s9mPgHPloOhTQJLCTegavKlM+LOWIbGpmdrfE O0BA== X-Gm-Message-State: AIkVDXJnqhoYpNhsDayjejezfN+ZNTTMDp5HkXDiHl07N1ToC8aUJQI/k895kst1lZenLg== X-Received: by 10.129.31.136 with SMTP id f130mr2285654ywf.134.1485443781336; Thu, 26 Jan 2017 07:16:21 -0800 (PST) From: David Windsor To: netdev@vger.kernel.org, kernel-hardening@lists.openwall.com, netfilter-devel@vger.kernel.org Cc: lvs-devel@vger.kernel.org, wensong@linux-vs.org, ja@ssi.bg, horms@verge.net.au, pablo@netfilter.org, dwindsor@gmail.com, keescook@chromium.org, elena.reshetova@intel.com, ishkamiel@gmail.com Date: Mon, 23 Jan 2017 22:24:29 -0500 Message-Id: <1485228269-21758-1-git-send-email-dwindsor@gmail.com> X-Mailer: git-send-email 2.7.4 Subject: [kernel-hardening] [PATCH v2 net] net: free ip_vs_dest structs when refcnt=0 X-Virus-Scanned: ClamAV using ClamSMTP Currently, the ip_vs_dest cache frees ip_vs_dest objects when their reference count becomes < 0. Aside from not being semantically sound, this is problematic for the new type refcount_t, which will be introduced shortly in a separate patch. refcount_t is the new kernel type for holding reference counts, and provides overflow protection and a constrained interface relative to atomic_t (the type currently being used for kernel reference counts). Per Julian Anastasov: "The problem is that dest_trash currently holds deleted dests (unlinked from RCU lists) with refcnt=0." Changing dest_trash to hold dest with refcnt=1 will allow us to free ip_vs_dest structs when their refcnt=0, in ip_vs_dest_put_and_free(). Signed-off-by: David Windsor Signed-off-by: Julian Anastasov Signed-off-by: Simon Horman --- include/net/ip_vs.h | 2 +- net/netfilter/ipvs/ip_vs_ctl.c | 8 +++----- 2 files changed, 4 insertions(+), 6 deletions(-) diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h index cd6018a..a3e78ad 100644 --- a/include/net/ip_vs.h +++ b/include/net/ip_vs.h @@ -1421,7 +1421,7 @@ static inline void ip_vs_dest_put(struct ip_vs_dest *dest) static inline void ip_vs_dest_put_and_free(struct ip_vs_dest *dest) { - if (atomic_dec_return(&dest->refcnt) < 0) + if (atomic_dec_and_test(&dest->refcnt)) kfree(dest); } diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c index 55e0169..5fc4836 100644 --- a/net/netfilter/ipvs/ip_vs_ctl.c +++ b/net/netfilter/ipvs/ip_vs_ctl.c @@ -711,7 +711,6 @@ ip_vs_trash_get_dest(struct ip_vs_service *svc, int dest_af, dest->vport == svc->port))) { /* HIT */ list_del(&dest->t_list); - ip_vs_dest_hold(dest); goto out; } } @@ -741,7 +740,7 @@ static void ip_vs_dest_free(struct ip_vs_dest *dest) * When the ip_vs_control_clearup is activated by ipvs module exit, * the service tables must have been flushed and all the connections * are expired, and the refcnt of each destination in the trash must - * be 0, so we simply release them here. + * be 1, so we simply release them here. */ static void ip_vs_trash_cleanup(struct netns_ipvs *ipvs) { @@ -1080,11 +1079,10 @@ static void __ip_vs_del_dest(struct netns_ipvs *ipvs, struct ip_vs_dest *dest, if (list_empty(&ipvs->dest_trash) && !cleanup) mod_timer(&ipvs->dest_trash_timer, jiffies + (IP_VS_DEST_TRASH_PERIOD >> 1)); - /* dest lives in trash without reference */ + /* dest lives in trash with reference */ list_add(&dest->t_list, &ipvs->dest_trash); dest->idle_start = 0; spin_unlock_bh(&ipvs->dest_trash_lock); - ip_vs_dest_put(dest); } @@ -1160,7 +1158,7 @@ static void ip_vs_dest_trash_expire(unsigned long data) spin_lock(&ipvs->dest_trash_lock); list_for_each_entry_safe(dest, next, &ipvs->dest_trash, t_list) { - if (atomic_read(&dest->refcnt) > 0) + if (atomic_read(&dest->refcnt) > 1) continue; if (dest->idle_start) { if (time_before(now, dest->idle_start +