Message ID | 20220809213033.24147-3-memxor@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 275c30bcee66a27d1aa97a215d607ad6d49804cb |
Delegated to: | BPF |
Headers | show |
Series | Don't reinit map value in prealloc_lru_pop | expand |
On Tue, Aug 09, 2022 at 11:30:32PM +0200, Kumar Kartikeya Dwivedi wrote: > The LRU map that is preallocated may have its elements reused while > another program holds a pointer to it from bpf_map_lookup_elem. Hence, > only check_and_free_fields is appropriate when the element is being > deleted, as it ensures proper synchronization against concurrent access > of the map value. After that, we cannot call check_and_init_map_value > again as it may rewrite bpf_spin_lock, bpf_timer, and kptr fields while > they can be concurrently accessed from a BPF program. > > This is safe to do as when the map entry is deleted, concurrent access > is protected against by check_and_free_fields, i.e. an existing timer > would be freed, and any existing kptr will be released by it. The > program can create further timers and kptrs after check_and_free_fields, > but they will eventually be released once the preallocated items are > freed on map destruction, even if the item is never reused again. Hence, > the deleted item sitting in the free list can still have resources > attached to it, and they would never leak. > > With spin_lock, we never touch the field at all on delete or update, as > we may end up modifying the state of the lock. Since the verifier > ensures that a bpf_spin_lock call is always paired with bpf_spin_unlock > call, the program will eventually release the lock so that on reuse the > new user of the value can take the lock. The bpf_spin_lock's verifier description makes sense. Note that the lru map does not support spin lock for now. > > Essentially, for the preallocated case, we must assume that the map > value may always be in use by the program, even when it is sitting in > the freelist, and handle things accordingly, i.e. use proper > synchronization inside check_and_free_fields, and never reinitialize the > special fields when it is reused on update. Acked-by: Martin KaFai Lau <kafai@fb.com>
On Tue, Aug 9, 2022 at 3:29 PM Martin KaFai Lau <kafai@fb.com> wrote: > > On Tue, Aug 09, 2022 at 11:30:32PM +0200, Kumar Kartikeya Dwivedi wrote: > > The LRU map that is preallocated may have its elements reused while > > another program holds a pointer to it from bpf_map_lookup_elem. Hence, > > only check_and_free_fields is appropriate when the element is being > > deleted, as it ensures proper synchronization against concurrent access > > of the map value. After that, we cannot call check_and_init_map_value > > again as it may rewrite bpf_spin_lock, bpf_timer, and kptr fields while > > they can be concurrently accessed from a BPF program. > > > > This is safe to do as when the map entry is deleted, concurrent access > > is protected against by check_and_free_fields, i.e. an existing timer > > would be freed, and any existing kptr will be released by it. The > > program can create further timers and kptrs after check_and_free_fields, > > but they will eventually be released once the preallocated items are > > freed on map destruction, even if the item is never reused again. Hence, > > the deleted item sitting in the free list can still have resources > > attached to it, and they would never leak. > > > > With spin_lock, we never touch the field at all on delete or update, as > > we may end up modifying the state of the lock. Since the verifier > > ensures that a bpf_spin_lock call is always paired with bpf_spin_unlock > > call, the program will eventually release the lock so that on reuse the > > new user of the value can take the lock. > The bpf_spin_lock's verifier description makes sense. Note that > the lru map does not support spin lock for now. ahh. then it's not a bpf tree material. It's a minor cleanup for bpf-next? > > > > Essentially, for the preallocated case, we must assume that the map > > value may always be in use by the program, even when it is sitting in > > the freelist, and handle things accordingly, i.e. use proper > > synchronization inside check_and_free_fields, and never reinitialize the > > special fields when it is reused on update. > Acked-by: Martin KaFai Lau <kafai@fb.com>
On Wed, 10 Aug 2022 at 02:50, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Tue, Aug 9, 2022 at 3:29 PM Martin KaFai Lau <kafai@fb.com> wrote: > > > > On Tue, Aug 09, 2022 at 11:30:32PM +0200, Kumar Kartikeya Dwivedi wrote: > > > The LRU map that is preallocated may have its elements reused while > > > another program holds a pointer to it from bpf_map_lookup_elem. Hence, > > > only check_and_free_fields is appropriate when the element is being > > > deleted, as it ensures proper synchronization against concurrent access > > > of the map value. After that, we cannot call check_and_init_map_value > > > again as it may rewrite bpf_spin_lock, bpf_timer, and kptr fields while > > > they can be concurrently accessed from a BPF program. > > > > > > This is safe to do as when the map entry is deleted, concurrent access > > > is protected against by check_and_free_fields, i.e. an existing timer > > > would be freed, and any existing kptr will be released by it. The > > > program can create further timers and kptrs after check_and_free_fields, > > > but they will eventually be released once the preallocated items are > > > freed on map destruction, even if the item is never reused again. Hence, > > > the deleted item sitting in the free list can still have resources > > > attached to it, and they would never leak. > > > > > > With spin_lock, we never touch the field at all on delete or update, as > > > we may end up modifying the state of the lock. Since the verifier > > > ensures that a bpf_spin_lock call is always paired with bpf_spin_unlock > > > call, the program will eventually release the lock so that on reuse the > > > new user of the value can take the lock. > > The bpf_spin_lock's verifier description makes sense. Note that > > the lru map does not support spin lock for now. > > ahh. then it's not a bpf tree material. > It's a minor cleanup for bpf-next? > I was just describing what we do for each of the three types in the commit log. It still affects timers and kptrs, which lru map supports. > > > > > > Essentially, for the preallocated case, we must assume that the map > > > value may always be in use by the program, even when it is sitting in > > > the freelist, and handle things accordingly, i.e. use proper > > > synchronization inside check_and_free_fields, and never reinitialize the > > > special fields when it is reused on update. > > Acked-by: Martin KaFai Lau <kafai@fb.com>
On Tue, Aug 9, 2022 at 5:52 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > On Wed, 10 Aug 2022 at 02:50, Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: > > > > On Tue, Aug 9, 2022 at 3:29 PM Martin KaFai Lau <kafai@fb.com> wrote: > > > > > > On Tue, Aug 09, 2022 at 11:30:32PM +0200, Kumar Kartikeya Dwivedi wrote: > > > > The LRU map that is preallocated may have its elements reused while > > > > another program holds a pointer to it from bpf_map_lookup_elem. Hence, > > > > only check_and_free_fields is appropriate when the element is being > > > > deleted, as it ensures proper synchronization against concurrent access > > > > of the map value. After that, we cannot call check_and_init_map_value > > > > again as it may rewrite bpf_spin_lock, bpf_timer, and kptr fields while > > > > they can be concurrently accessed from a BPF program. > > > > > > > > This is safe to do as when the map entry is deleted, concurrent access > > > > is protected against by check_and_free_fields, i.e. an existing timer > > > > would be freed, and any existing kptr will be released by it. The > > > > program can create further timers and kptrs after check_and_free_fields, > > > > but they will eventually be released once the preallocated items are > > > > freed on map destruction, even if the item is never reused again. Hence, > > > > the deleted item sitting in the free list can still have resources > > > > attached to it, and they would never leak. > > > > > > > > With spin_lock, we never touch the field at all on delete or update, as > > > > we may end up modifying the state of the lock. Since the verifier > > > > ensures that a bpf_spin_lock call is always paired with bpf_spin_unlock > > > > call, the program will eventually release the lock so that on reuse the > > > > new user of the value can take the lock. > > > The bpf_spin_lock's verifier description makes sense. Note that > > > the lru map does not support spin lock for now. > > > > ahh. then it's not a bpf tree material. > > It's a minor cleanup for bpf-next? > > > > I was just describing what we do for each of the three types in the > commit log. It still affects timers and kptrs, which lru map supports. got it. The uaf-ing cpu can use kptr field of deleted elem and things go bad it lru_pop-ing cpu will clear it.
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index da7578426a46..4d793a92301b 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -311,12 +311,8 @@ static struct htab_elem *prealloc_lru_pop(struct bpf_htab *htab, void *key, struct htab_elem *l; if (node) { - u32 key_size = htab->map.key_size; - l = container_of(node, struct htab_elem, lru_node); - memcpy(l->key, key, key_size); - check_and_init_map_value(&htab->map, - l->key + round_up(key_size, 8)); + memcpy(l->key, key, htab->map.key_size); return l; }