From patchwork Fri Mar 19 03:12:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yonghong Song X-Patchwork-Id: 12149885 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2107FC433DB for ; Fri, 19 Mar 2021 03:16:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E206364EF2 for ; Fri, 19 Mar 2021 03:16:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233466AbhCSDPk (ORCPT ); Thu, 18 Mar 2021 23:15:40 -0400 Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:21252 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233461AbhCSDPT (ORCPT ); Thu, 18 Mar 2021 23:15:19 -0400 Received: from pps.filterd (m0044012.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.0.43/8.16.0.43) with SMTP id 12J3AsB1027576 for ; Thu, 18 Mar 2021 20:15:19 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-type : content-transfer-encoding : mime-version; s=facebook; bh=EgpF8jRdytYlMVLF25N7Soewb8CI9lGZks4EC04LQ3I=; b=DZiPEXVvxynMmPABchHDegPrsYHFDdGcgxsnf49lVQP/1u0wPSVblbMpWiQDCRVoQ61K 93zwn2aumUkZD1t0NJQeR/RLDBtXrrS47+kufz792Tdzma8UBSRWNIU99ZyV45b2k093 WAE/IqJLRUh+w2ESnaWy2gMqzf++8yrOh7I= Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com with ESMTP id 37c4vnd9hf-3 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 18 Mar 2021 20:15:19 -0700 Received: from intmgw001.05.ash7.facebook.com (2620:10d:c085:208::f) by mail.thefacebook.com (2620:10d:c085:21d::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Thu, 18 Mar 2021 20:15:18 -0700 Received: by devbig003.ftw2.facebook.com (Postfix, from userid 128203) id BD4427A9D89; Thu, 18 Mar 2021 20:12:36 -0700 (PDT) From: Yonghong Song To: CC: Alexei Starovoitov , Daniel Borkmann , , Jiri Olsa , Roman Gushchin Subject: [PATCH bpf-next v2 1/2] bpf: fix NULL pointer dereference in bpf_get_local_storage() helper Date: Thu, 18 Mar 2021 20:12:36 -0700 Message-ID: <20210319031236.3709026-1-yhs@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210319031231.3707705-1-yhs@fb.com> References: <20210319031231.3707705-1-yhs@fb.com> X-FB-Internal: Safe X-Proofpoint-UnRewURL: 0 URL was un-rewritten MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.369,18.0.761 definitions=2021-03-18_19:2021-03-17,2021-03-18 signatures=0 X-Proofpoint-Spam-Details: rule=fb_default_notspam policy=fb_default score=0 priorityscore=1501 phishscore=0 suspectscore=0 adultscore=0 mlxscore=0 malwarescore=0 impostorscore=0 lowpriorityscore=0 bulkscore=0 spamscore=0 mlxlogscore=999 clxscore=1015 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2009150000 definitions=main-2103190022 X-FB-Internal: deliver Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Jiri Olsa reported a bug ([1]) in kernel where cgroup local storage pointer may be NULL in bpf_get_local_storage() helper. There are two issues uncovered by this bug: (1). kprobe or tracepoint prog incorrectly sets cgroup local storage before prog run, (2). due to change from preempt_disable to migrate_disable, preemption is possible and percpu storage might be overwritten by other tasks. This issue (1) is fixed in [2]. This patch tried to address issue (2). The following shows how things can go wrong: task 1: bpf_cgroup_storage_set() for percpu local storage preemption happens task 2: bpf_cgroup_storage_set() for percpu local storage preemption happens task 1: run bpf program task 1 will effectively use the percpu local storage setting by task 2 which will be either NULL or incorrect ones. Instead of just one common local storage per cpu, this patch fixed the issue by permitting 8 local storages per cpu and each local storage is identified by a task_struct pointer. This way, we allow at most 8 nested preemption between bpf_cgroup_storage_set() and bpf_cgroup_storage_unset(). The percpu local storage slot is released (calling bpf_cgroup_storage_unset()) by the same task after bpf program finished running. The patch is tested on top of [2] with reproducer in [1]. Without this patch, kernel will emit error in 2-3 minutes. With this patch, after one hour, still no error. [1] https://lore.kernel.org/bpf/CAKH8qBuXCfUz=w8L+Fj74OaUpbosO29niYwTki7e3Ag044_aww@mail.gmail.com/T [2] https://lore.kernel.org/bpf/CAKH8qBuXCfUz=w8L+Fj74OaUpbosO29niYwTki7e3Ag044_aww@mail.gmail.com/T Cc: Jiri Olsa Cc: Roman Gushchin Signed-off-by: Yonghong Song --- include/linux/bpf-cgroup.h | 58 ++++++++++++++++++++++++++++++++------ include/linux/bpf.h | 22 ++++++++++++--- kernel/bpf/helpers.c | 15 +++++++--- kernel/bpf/local_storage.c | 5 ++-- 4 files changed, 82 insertions(+), 18 deletions(-) diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h index c42e02b4d84b..24689416b534 100644 --- a/include/linux/bpf-cgroup.h +++ b/include/linux/bpf-cgroup.h @@ -20,14 +20,25 @@ struct bpf_sock_ops_kern; struct bpf_cgroup_storage; struct ctl_table; struct ctl_table_header; +struct task_struct; #ifdef CONFIG_CGROUP_BPF extern struct static_key_false cgroup_bpf_enabled_key[MAX_BPF_ATTACH_TYPE]; #define cgroup_bpf_enabled(type) static_branch_unlikely(&cgroup_bpf_enabled_key[type]) -DECLARE_PER_CPU(struct bpf_cgroup_storage*, - bpf_cgroup_storage[MAX_BPF_CGROUP_STORAGE_TYPE]); +#define BPF_CGROUP_STORAGE_NEST_MAX 8 + +struct bpf_cgroup_storage_info { + struct task_struct *task; + struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE]; +}; + +/* For each cpu, permit maximum BPF_CGROUP_STORAGE_NEST_MAX number of tasks + * to use bpf cgroup storage simultaneously. + */ +DECLARE_PER_CPU(struct bpf_cgroup_storage_info, + bpf_cgroup_storage_info[BPF_CGROUP_STORAGE_NEST_MAX]); #define for_each_cgroup_storage_type(stype) \ for (stype = 0; stype < MAX_BPF_CGROUP_STORAGE_TYPE; stype++) @@ -161,13 +172,43 @@ static inline enum bpf_cgroup_storage_type cgroup_storage_type( return BPF_CGROUP_STORAGE_SHARED; } -static inline void bpf_cgroup_storage_set(struct bpf_cgroup_storage - *storage[MAX_BPF_CGROUP_STORAGE_TYPE]) +static inline int bpf_cgroup_storage_set(struct bpf_cgroup_storage + *storage[MAX_BPF_CGROUP_STORAGE_TYPE]) { enum bpf_cgroup_storage_type stype; + int i; + + preempt_disable(); + for (i = 0; i < BPF_CGROUP_STORAGE_NEST_MAX; i++) { + if (unlikely(this_cpu_read(bpf_cgroup_storage_info[i].task) != NULL)) + continue; + + this_cpu_write(bpf_cgroup_storage_info[i].task, current); + for_each_cgroup_storage_type(stype) + this_cpu_write(bpf_cgroup_storage_info[i].storage[stype], + storage[stype]); + break; + } + preempt_enable(); + + if (i == BPF_CGROUP_STORAGE_NEST_MAX) { + WARN_ON_ONCE(1); + return -EBUSY; + } + return 0; +} + +static inline void bpf_cgroup_storage_unset(void) +{ + int i; + + for (i = 0; i < BPF_CGROUP_STORAGE_NEST_MAX; i++) { + if (unlikely(this_cpu_read(bpf_cgroup_storage_info[i].task) != current)) + continue; - for_each_cgroup_storage_type(stype) - this_cpu_write(bpf_cgroup_storage[stype], storage[stype]); + this_cpu_write(bpf_cgroup_storage_info[i].task, NULL); + return; + } } struct bpf_cgroup_storage * @@ -448,8 +489,9 @@ static inline int cgroup_bpf_prog_query(const union bpf_attr *attr, return -EINVAL; } -static inline void bpf_cgroup_storage_set( - struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE]) {} +static inline int bpf_cgroup_storage_set( + struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE]) { return 0; } +static inline void bpf_cgroup_storage_unset(void) {} static inline int bpf_cgroup_storage_assign(struct bpf_prog_aux *aux, struct bpf_map *map) { return 0; } static inline struct bpf_cgroup_storage *bpf_cgroup_storage_alloc( diff --git a/include/linux/bpf.h b/include/linux/bpf.h index a47285cd39c2..3a6ae69743ff 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1090,6 +1090,13 @@ int bpf_prog_array_copy(struct bpf_prog_array *old_array, /* BPF program asks to set CN on the packet. */ #define BPF_RET_SET_CN (1 << 0) +/* For BPF_PROG_RUN_ARRAY_FLAGS and __BPF_PROG_RUN_ARRAY, + * if bpf_cgroup_storage_set() failed, the rest of programs + * will not execute. This should be a really rare scenario + * as it requires BPF_CGROUP_STORAGE_NEST_MAX number of + * preemptions all between bpf_cgroup_storage_set() and + * bpf_cgroup_storage_unset() on the same cpu. + */ #define BPF_PROG_RUN_ARRAY_FLAGS(array, ctx, func, ret_flags) \ ({ \ struct bpf_prog_array_item *_item; \ @@ -1102,10 +1109,12 @@ int bpf_prog_array_copy(struct bpf_prog_array *old_array, _array = rcu_dereference(array); \ _item = &_array->items[0]; \ while ((_prog = READ_ONCE(_item->prog))) { \ - bpf_cgroup_storage_set(_item->cgroup_storage); \ + if (unlikely(bpf_cgroup_storage_set(_item->cgroup_storage))) \ + break; \ func_ret = func(_prog, ctx); \ _ret &= (func_ret & 1); \ *(ret_flags) |= (func_ret >> 1); \ + bpf_cgroup_storage_unset(); \ _item++; \ } \ rcu_read_unlock(); \ @@ -1126,9 +1135,14 @@ int bpf_prog_array_copy(struct bpf_prog_array *old_array, goto _out; \ _item = &_array->items[0]; \ while ((_prog = READ_ONCE(_item->prog))) { \ - if (set_cg_storage) \ - bpf_cgroup_storage_set(_item->cgroup_storage); \ - _ret &= func(_prog, ctx); \ + if (!set_cg_storage) { \ + _ret &= func(_prog, ctx); \ + } else { \ + if (unlikely(bpf_cgroup_storage_set(_item->cgroup_storage))) \ + break; \ + _ret &= func(_prog, ctx); \ + bpf_cgroup_storage_unset(); \ + } \ _item++; \ } \ _out: \ diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 074800226327..f306611c4ddf 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -382,8 +382,8 @@ const struct bpf_func_proto bpf_get_current_ancestor_cgroup_id_proto = { }; #ifdef CONFIG_CGROUP_BPF -DECLARE_PER_CPU(struct bpf_cgroup_storage*, - bpf_cgroup_storage[MAX_BPF_CGROUP_STORAGE_TYPE]); +DECLARE_PER_CPU(struct bpf_cgroup_storage_info, + bpf_cgroup_storage_info[BPF_CGROUP_STORAGE_NEST_MAX]); BPF_CALL_2(bpf_get_local_storage, struct bpf_map *, map, u64, flags) { @@ -392,10 +392,17 @@ BPF_CALL_2(bpf_get_local_storage, struct bpf_map *, map, u64, flags) * verifier checks that its value is correct. */ enum bpf_cgroup_storage_type stype = cgroup_storage_type(map); - struct bpf_cgroup_storage *storage; + struct bpf_cgroup_storage *storage = NULL; void *ptr; + int i; - storage = this_cpu_read(bpf_cgroup_storage[stype]); + for (i = 0; i < BPF_CGROUP_STORAGE_NEST_MAX; i++) { + if (unlikely(this_cpu_read(bpf_cgroup_storage_info[i].task) != current)) + continue; + + storage = this_cpu_read(bpf_cgroup_storage_info[i].storage[stype]); + break; + } if (stype == BPF_CGROUP_STORAGE_SHARED) ptr = &READ_ONCE(storage->buf)->data[0]; diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c index 2d4f9ac12377..bd11db9774c3 100644 --- a/kernel/bpf/local_storage.c +++ b/kernel/bpf/local_storage.c @@ -9,10 +9,11 @@ #include #include -DEFINE_PER_CPU(struct bpf_cgroup_storage*, bpf_cgroup_storage[MAX_BPF_CGROUP_STORAGE_TYPE]); - #ifdef CONFIG_CGROUP_BPF +DEFINE_PER_CPU(struct bpf_cgroup_storage_info, + bpf_cgroup_storage_info[BPF_CGROUP_STORAGE_NEST_MAX]); + #include "../cgroup/cgroup-internal.h" #define LOCAL_STORAGE_CREATE_FLAG_MASK \ From patchwork Fri Mar 19 03:12:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yonghong Song X-Patchwork-Id: 12149883 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28B06C4332B for ; Fri, 19 Mar 2021 03:13:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id ECBD764DD2 for ; Fri, 19 Mar 2021 03:13:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233481AbhCSDMz (ORCPT ); Thu, 18 Mar 2021 23:12:55 -0400 Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:37346 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231864AbhCSDMu (ORCPT ); Thu, 18 Mar 2021 23:12:50 -0400 Received: from pps.filterd (m0109334.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.0.43/8.16.0.43) with SMTP id 12J34Snd029504 for ; Thu, 18 Mar 2021 20:12:50 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=+FwlTZK1j9LgfSFuLiWA531SLG7H/kYuJRz/AkwN104=; b=Cj/cWa5tkHMbX8MAw/U6PWC28NL9nS38XNBPqSaDGhDL9E0BwyCOQn8UjC7g92J2DajK /fSMSJY0DSFajuJ9rIWIio2+w9rX69UneWxAReYJi3OGhf8zv1rnIhf4FugQe1f6aKmm x/1y4A/Z0rugr9GrfaYLmvZz7ClQBB/jaK0= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com with ESMTP id 37cefu9jkk-10 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 18 Mar 2021 20:12:50 -0700 Received: from intmgw001.38.frc1.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:82::e) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Thu, 18 Mar 2021 20:12:46 -0700 Received: by devbig003.ftw2.facebook.com (Postfix, from userid 128203) id F37717A9E59; Thu, 18 Mar 2021 20:12:41 -0700 (PDT) From: Yonghong Song To: CC: Alexei Starovoitov , Daniel Borkmann , , Jiri Olsa , Roman Gushchin Subject: [PATCH bpf-next v2 2/2] bpf: fix bpf_cgroup_storage_set() usage in test_run Date: Thu, 18 Mar 2021 20:12:41 -0700 Message-ID: <20210319031241.3709930-1-yhs@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210319031231.3707705-1-yhs@fb.com> References: <20210319031231.3707705-1-yhs@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.369,18.0.761 definitions=2021-03-18_19:2021-03-17,2021-03-18 signatures=0 X-Proofpoint-Spam-Details: rule=fb_default_notspam policy=fb_default score=0 lowpriorityscore=0 impostorscore=0 mlxscore=0 malwarescore=0 suspectscore=0 mlxlogscore=711 adultscore=0 bulkscore=0 priorityscore=1501 clxscore=1015 phishscore=0 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2009150000 definitions=main-2103190021 X-FB-Internal: deliver Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net In bpf_test_run(), check the return value of bpf_cgroup_storage_set() and do bpf_cgroup_storate_unset() properly. Cc: Jiri Olsa Cc: Roman Gushchin Signed-off-by: Yonghong Song --- net/bpf/test_run.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index 0abdd67f44b1..4aabf71cd95d 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -106,12 +106,16 @@ static int bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat, bpf_test_timer_enter(&t); do { - bpf_cgroup_storage_set(storage); + ret = bpf_cgroup_storage_set(storage); + if (ret) + break; if (xdp) *retval = bpf_prog_run_xdp(prog, ctx); else *retval = BPF_PROG_RUN(prog, ctx); + + bpf_cgroup_storage_unset(); } while (bpf_test_timer_continue(&t, repeat, &ret, time)); bpf_test_timer_leave(&t);