| Message ID | 20230918093620.3479627-1-make_ruc2021@163.com (mailing list archive) |
|---|---|
| State | Changes Requested |
| Delegated to: | BPF |
| Series | bpf, sockmap: fix deadlocks in the sockhash and sockmap |
On 9/18/23 02:36, Ma Ke wrote:
> It seems that elements in sockhash are rarely actively
> deleted by users or ebpf program. Therefore, we do not
> pay much attention to their deletion. Compared with hash
> maps, sockhash only provides spin_lock_bh protection.
> This causes it to appear to have self-locking behavior
> in the interrupt context, as CVE-2023-0160 points out.
>
> Signed-off-by: Ma Ke <make_ruc2021@163.com>
> ---
>  net/core/sock_map.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index cb11750b1df5..1302d484e769 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -928,11 +928,12 @@ static long sock_hash_delete_elem(struct bpf_map *map, void *key)
>  	struct bpf_shtab_bucket *bucket;
>  	struct bpf_shtab_elem *elem;
>  	int ret = -ENOENT;
> +	unsigned long flags;

Keep reverse xmas tree ordering?

>
>  	hash = sock_hash_bucket_hash(key, key_size);
>  	bucket = sock_hash_select_bucket(htab, hash);
>
> -	spin_lock_bh(&bucket->lock);
> +	spin_lock_irqsave(&bucket->lock, flags);
>  	elem = sock_hash_lookup_elem_raw(&bucket->head, hash, key, key_size);
>  	if (elem) {
>  		hlist_del_rcu(&elem->node);
> @@ -940,7 +941,7 @@ static long sock_hash_delete_elem(struct bpf_map *map, void *key)
>  		sock_hash_free_elem(htab, elem);
>  		ret = 0;
>  	}
> -	spin_unlock_bh(&bucket->lock);
> +	spin_unlock_irqrestore(&bucket->lock, flags);
>  	return ret;
>  }
>
Kui-Feng Lee wrote:
>
> On 9/18/23 02:36, Ma Ke wrote:
> > It seems that elements in sockhash are rarely actively
> > deleted by users or ebpf program. Therefore, we do not

We never delete them in our usage. I think soon we will have support
to run BPF programs without a map at all, removing these concerns for
many use cases.

> > pay much attention to their deletion. Compared with hash
> > maps, sockhash only provides spin_lock_bh protection.
> > This causes it to appear to have self-locking behavior
> > in the interrupt context, as CVE-2023-0160 points out.

The CVE is a bit exaggerated in my opinion. I'm not sure why anyone
would delete an element from interrupt context. But, OK, if someone
wrote such a thing we shouldn't lock up.

> >
> > Signed-off-by: Ma Ke <make_ruc2021@163.com>
> > ---
> >  net/core/sock_map.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> > index cb11750b1df5..1302d484e769 100644
> > --- a/net/core/sock_map.c
> > +++ b/net/core/sock_map.c
> > @@ -928,11 +928,12 @@ static long sock_hash_delete_elem(struct bpf_map *map, void *key)
> >  	struct bpf_shtab_bucket *bucket;
> >  	struct bpf_shtab_elem *elem;
> >  	int ret = -ENOENT;
> > +	unsigned long flags;
>
> Keep reverse xmas tree ordering?
>
> >
> >  	hash = sock_hash_bucket_hash(key, key_size);
> >  	bucket = sock_hash_select_bucket(htab, hash);
> >
> > -	spin_lock_bh(&bucket->lock);
> > +	spin_lock_irqsave(&bucket->lock, flags);

The hashtab code htab_lock_bucket() also does a preempt_disable()
followed by raw_spin_lock_irqsave(). Do we need this as well to handle
the CONFIG_PREEMPT cases? I'll also take a look, but figured I would
post the question given I won't likely get time to check until
tonight/tomorrow.

Also, a previous conversion to irqsave ran into a syzbot crash; won't
this do the same?

> >  	elem = sock_hash_lookup_elem_raw(&bucket->head, hash, key, key_size);
> >  	if (elem) {
> >  		hlist_del_rcu(&elem->node);
> > @@ -940,7 +941,7 @@ static long sock_hash_delete_elem(struct bpf_map *map, void *key)
> >  		sock_hash_free_elem(htab, elem);
> >  		ret = 0;
> >  	}
> > -	spin_unlock_bh(&bucket->lock);
> > +	spin_unlock_irqrestore(&bucket->lock, flags);
> >  	return ret;
> >  }
> >
On 9/20/23 11:07 AM, John Fastabend wrote:
>>> pay much attention to their deletion. Compared with hash
>>> maps, sockhash only provides spin_lock_bh protection.
>>> This causes it to appear to have self-locking behavior
>>> in the interrupt context, as CVE-2023-0160 points out.
>
> The CVE is a bit exaggerated in my opinion. I'm not sure why
> anyone would delete an element from interrupt context. But,
> OK, if someone wrote such a thing we shouldn't lock up.

This should only happen in a tracing program? Not sure if it would be
too drastic to disallow tracing programs from using
bpf_map_delete_elem at load time now.

A followup question: if sockmap can be accessed from a tracing
program, does it need an in_nmi() check?

>>> 	hash = sock_hash_bucket_hash(key, key_size);
>>> 	bucket = sock_hash_select_bucket(htab, hash);
>>>
>>> -	spin_lock_bh(&bucket->lock);
>>> +	spin_lock_irqsave(&bucket->lock, flags);
>
> The hashtab code htab_lock_bucket() also does a preempt_disable()
> followed by raw_spin_lock_irqsave(). Do we need this as well
> to handle the CONFIG_PREEMPT cases?

iirc, the preempt_disable in htab is for CONFIG_PREEMPT, but it is
there for the __this_cpu_inc_return, to avoid unnecessary lock failure
due to preemption, so it is probably not needed here. See commit
2775da216287 ("bpf: Disable preemption when increasing per-cpu
map_locked").

If map_delete can be called from any tracing context, the
raw_spin_lock_xxx version is probably needed though. Otherwise, a
splat (e.g. PROVE_RAW_LOCK_NESTING) could be triggered.
Martin KaFai Lau wrote:
> On 9/20/23 11:07 AM, John Fastabend wrote:
> >>> pay much attention to their deletion. Compared with hash
> >>> maps, sockhash only provides spin_lock_bh protection.
> >>> This causes it to appear to have self-locking behavior
> >>> in the interrupt context, as CVE-2023-0160 points out.
> >
> > The CVE is a bit exaggerated in my opinion. I'm not sure why
> > anyone would delete an element from interrupt context. But,
> > OK, if someone wrote such a thing we shouldn't lock up.
>
> This should only happen in a tracing program?
> Not sure if it would be too drastic to disallow tracing programs
> from using bpf_map_delete_elem at load time now.

I don't think we have any users from tracing programs, but there might
be something out there?

> A followup question: if sockmap can be accessed from a tracing
> program, does it need an in_nmi() check?

I think we could just do 'if (in_nmi()) return -EOPNOTSUPP;'.

> >>> 	hash = sock_hash_bucket_hash(key, key_size);
> >>> 	bucket = sock_hash_select_bucket(htab, hash);
> >>>
> >>> -	spin_lock_bh(&bucket->lock);
> >>> +	spin_lock_irqsave(&bucket->lock, flags);
> >
> > The hashtab code htab_lock_bucket() also does a preempt_disable()
> > followed by raw_spin_lock_irqsave(). Do we need this as well
> > to handle the CONFIG_PREEMPT cases?
>
> iirc, the preempt_disable in htab is for CONFIG_PREEMPT, but it is
> there for the __this_cpu_inc_return, to avoid unnecessary lock
> failure due to preemption, so it is probably not needed here. See
> commit 2775da216287 ("bpf: Disable preemption when increasing
> per-cpu map_locked").
>
> If map_delete can be called from any tracing context, the
> raw_spin_lock_xxx version is probably needed though. Otherwise, a
> splat (e.g. PROVE_RAW_LOCK_NESTING) could be triggered.

Yep. I'll look at it I guess. We should probably either block access
from tracing programs or add some tests.
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index cb11750b1df5..1302d484e769 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -928,11 +928,12 @@ static long sock_hash_delete_elem(struct bpf_map *map, void *key)
 	struct bpf_shtab_bucket *bucket;
 	struct bpf_shtab_elem *elem;
 	int ret = -ENOENT;
+	unsigned long flags;
 
 	hash = sock_hash_bucket_hash(key, key_size);
 	bucket = sock_hash_select_bucket(htab, hash);
 
-	spin_lock_bh(&bucket->lock);
+	spin_lock_irqsave(&bucket->lock, flags);
 	elem = sock_hash_lookup_elem_raw(&bucket->head, hash, key, key_size);
 	if (elem) {
 		hlist_del_rcu(&elem->node);
@@ -940,7 +941,7 @@ static long sock_hash_delete_elem(struct bpf_map *map, void *key)
 		sock_hash_free_elem(htab, elem);
 		ret = 0;
 	}
-	spin_unlock_bh(&bucket->lock);
+	spin_unlock_irqrestore(&bucket->lock, flags);
 	return ret;
 }
It seems that elements in sockhash are rarely actively
deleted by users or ebpf program. Therefore, we do not
pay much attention to their deletion. Compared with hash
maps, sockhash only provides spin_lock_bh protection.
This causes it to appear to have self-locking behavior
in the interrupt context, as CVE-2023-0160 points out.

Signed-off-by: Ma Ke <make_ruc2021@163.com>
---
 net/core/sock_map.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)