Message ID | 20240319175406.2940628-1-yonghong.song@linux.dev (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | bpf: Add bpf_link support for sk_msg and sk_skb progs | expand |
Hi Yonghong, kernel test robot noticed the following build warnings: [auto build test WARNING on bpf-next/master] url: https://github.com/intel-lab-lkp/linux/commits/Yonghong-Song/bpf-Add-bpf_link-support-for-sk_msg-and-sk_skb-progs/20240320-015917 base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master patch link: https://lore.kernel.org/r/20240319175406.2940628-1-yonghong.song%40linux.dev patch subject: [PATCH bpf-next v2 1/6] bpf: Add bpf_link support for sk_msg and sk_skb progs config: x86_64-randconfig-011-20240321 (https://download.01.org/0day-ci/archive/20240321/202403211543.wg8VXMiO-lkp@intel.com/config) compiler: clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240321/202403211543.wg8VXMiO-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202403211543.wg8VXMiO-lkp@intel.com/ All warnings (new ones prefixed by >>): net/core/skmsg.c:1279:9: error: call to undeclared function 'sock_map_prog_update'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 1279 | (void)sock_map_prog_update(sk_link->map, NULL, link->prog, | ^ net/core/skmsg.c:1281:3: error: call to undeclared function 'bpf_map_put_with_uref'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 1281 | bpf_map_put_with_uref(sk_link->map); | ^ net/core/skmsg.c:1316:8: error: call to undeclared function 'sock_map_prog_update'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 1316 | ret = sock_map_prog_update(sk_link->map, new_prog, old_prog, | ^ net/core/skmsg.c:1383:8: error: call to undeclared function 'bpf_map_get_with_uref'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 1383 | map = bpf_map_get_with_uref(attr->link_create.target_fd); | ^ net/core/skmsg.c:1383:6: error: incompatible integer to pointer conversion assigning to 'struct bpf_map *' from 'int' [-Wint-conversion] 1383 | map = bpf_map_get_with_uref(attr->link_create.target_fd); | ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ net/core/skmsg.c:1409:8: error: call to undeclared function 'sock_map_prog_update'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 1409 | ret = sock_map_prog_update(map, prog, NULL, attach_type); | ^ net/core/skmsg.c:1420:2: error: call to undeclared function 'bpf_map_put_with_uref'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 1420 | bpf_map_put_with_uref(map); | ^ >> net/core/skmsg.c:1371:5: warning: no previous prototype for function 'bpf_sk_msg_skb_link_create' [-Wmissing-prototypes] 1371 | int bpf_sk_msg_skb_link_create(const union bpf_attr *attr, struct bpf_prog *prog) | ^ net/core/skmsg.c:1371:1: note: declare 'static' if the function is not intended to be used outside of this translation unit 1371 | int bpf_sk_msg_skb_link_create(const union bpf_attr *attr, struct bpf_prog *prog) | ^ | static 1 warning and 7 errors generated. Kconfig warnings: (for reference only) WARNING: unmet direct dependencies detected for DRM_I915_DEBUG_GEM Depends on [n]: HAS_IOMEM [=y] && DRM_I915 [=y] && EXPERT [=y] && DRM_I915_WERROR [=n] Selected by [y]: - DRM_I915_DEBUG [=y] && HAS_IOMEM [=y] && DRM_I915 [=y] && EXPERT [=y] && !COMPILE_TEST [=n] vim +/bpf_sk_msg_skb_link_create +1371 net/core/skmsg.c 1370 > 1371 int bpf_sk_msg_skb_link_create(const union bpf_attr *attr, struct bpf_prog *prog) 1372 { 1373 struct bpf_link_primer link_primer; 1374 struct bpf_sk_msg_skb_link *sk_link; 1375 enum bpf_attach_type attach_type; 1376 enum bpf_link_type link_type; 1377 struct bpf_map *map; 1378 int ret; 1379 1380 if (attr->link_create.flags) 1381 return -EINVAL; 1382 > 1383 map = bpf_map_get_with_uref(attr->link_create.target_fd);
Hi Yonghong, kernel test robot noticed the following build errors: [auto build test ERROR on bpf-next/master] url: https://github.com/intel-lab-lkp/linux/commits/Yonghong-Song/bpf-Add-bpf_link-support-for-sk_msg-and-sk_skb-progs/20240320-015917 base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master patch link: https://lore.kernel.org/r/20240319175406.2940628-1-yonghong.song%40linux.dev patch subject: [PATCH bpf-next v2 1/6] bpf: Add bpf_link support for sk_msg and sk_skb progs config: x86_64-randconfig-011-20240321 (https://download.01.org/0day-ci/archive/20240322/202403220430.affB6tuK-lkp@intel.com/config) compiler: clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240322/202403220430.affB6tuK-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202403220430.affB6tuK-lkp@intel.com/ All errors (new ones prefixed by >>): >> net/core/skmsg.c:1279:9: error: call to undeclared function 'sock_map_prog_update'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 1279 | (void)sock_map_prog_update(sk_link->map, NULL, link->prog, | ^ >> net/core/skmsg.c:1281:3: error: call to undeclared function 'bpf_map_put_with_uref'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 1281 | bpf_map_put_with_uref(sk_link->map); | ^ net/core/skmsg.c:1316:8: error: call to undeclared function 'sock_map_prog_update'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 1316 | ret = sock_map_prog_update(sk_link->map, new_prog, old_prog, | ^ >> net/core/skmsg.c:1383:8: error: call to undeclared function 'bpf_map_get_with_uref'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 1383 | map = bpf_map_get_with_uref(attr->link_create.target_fd); | ^ >> net/core/skmsg.c:1383:6: error: incompatible integer to pointer conversion assigning to 'struct bpf_map *' from 'int' [-Wint-conversion] 1383 | map = bpf_map_get_with_uref(attr->link_create.target_fd); | ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ net/core/skmsg.c:1409:8: error: call to undeclared function 'sock_map_prog_update'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 1409 | ret = sock_map_prog_update(map, prog, NULL, attach_type); | ^ net/core/skmsg.c:1420:2: error: call to undeclared function 'bpf_map_put_with_uref'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 1420 | bpf_map_put_with_uref(map); | ^ net/core/skmsg.c:1371:5: warning: no previous prototype for function 'bpf_sk_msg_skb_link_create' [-Wmissing-prototypes] 1371 | int bpf_sk_msg_skb_link_create(const union bpf_attr *attr, struct bpf_prog *prog) | ^ net/core/skmsg.c:1371:1: note: declare 'static' if the function is not intended to be used outside of this translation unit 1371 | int bpf_sk_msg_skb_link_create(const union bpf_attr *attr, struct bpf_prog *prog) | ^ | static 1 warning and 7 errors generated. Kconfig warnings: (for reference only) WARNING: unmet direct dependencies detected for DRM_I915_DEBUG_GEM Depends on [n]: HAS_IOMEM [=y] && DRM_I915 [=y] && EXPERT [=y] && DRM_I915_WERROR [=n] Selected by [y]: - DRM_I915_DEBUG [=y] && HAS_IOMEM [=y] && DRM_I915 [=y] && EXPERT [=y] && !COMPILE_TEST [=n] vim +/sock_map_prog_update +1279 net/core/skmsg.c 1272 1273 static void bpf_sk_msg_skb_link_release(struct bpf_link *link) 1274 { 1275 struct bpf_sk_msg_skb_link *sk_link = bpf_sk_msg_skb_link(link); 1276 1277 mutex_lock(&link_mutex); 1278 if (sk_link->map) { > 1279 (void)sock_map_prog_update(sk_link->map, NULL, link->prog, 1280 sk_link->attach_type); > 1281 bpf_map_put_with_uref(sk_link->map); 1282 sk_link->map = NULL; 1283 } 1284 mutex_unlock(&link_mutex); 1285 } 1286 1287 static int bpf_sk_msg_skb_link_detach(struct bpf_link *link) 1288 { 1289 bpf_sk_msg_skb_link_release(link); 1290 return 0; 1291 } 1292 1293 static void bpf_sk_msg_skb_link_dealloc(struct bpf_link *link) 1294 { 1295 kfree(bpf_sk_msg_skb_link(link)); 1296 } 1297 1298 static int bpf_sk_msg_skb_link_update_prog(struct bpf_link *link, 1299 struct bpf_prog *new_prog, 1300 struct bpf_prog *old_prog) 1301 { 1302 const struct bpf_sk_msg_skb_link *sk_link = bpf_sk_msg_skb_link(link); 1303 int ret = 0; 1304 1305 mutex_lock(&link_mutex); 1306 if (old_prog && link->prog != old_prog) { 1307 ret = -EPERM; 1308 goto out; 1309 } 1310 1311 if (link->prog->type != new_prog->type) { 1312 ret = -EINVAL; 1313 goto out; 1314 } 1315 1316 ret = sock_map_prog_update(sk_link->map, new_prog, old_prog, 1317 sk_link->attach_type); 1318 if (!ret) 1319 bpf_prog_inc(new_prog); 1320 1321 out: 1322 mutex_unlock(&link_mutex); 1323 return ret; 1324 } 1325 1326 static int bpf_sk_msg_skb_link_fill_info(const struct bpf_link *link, 1327 struct bpf_link_info *info) 1328 { 1329 const struct bpf_sk_msg_skb_link *sk_link = bpf_sk_msg_skb_link(link); 1330 u32 map_id = 0; 1331 1332 mutex_lock(&link_mutex); 1333 if (sk_link->map) 1334 map_id = sk_link->map->id; 1335 mutex_unlock(&link_mutex); 1336 1337 if (link->type == BPF_LINK_TYPE_SK_MSG) { 1338 info->skmsg.map_id = map_id; 1339 info->skmsg.attach_type = sk_link->attach_type; 1340 } else { 1341 info->skskb.map_id = map_id; 1342 info->skskb.attach_type = sk_link->attach_type; 1343 } 1344 return 0; 1345 } 1346 1347 static void bpf_sk_msg_skb_link_show_fdinfo(const struct bpf_link *link, 1348 struct seq_file *seq) 1349 { 1350 const struct bpf_sk_msg_skb_link *sk_link = bpf_sk_msg_skb_link(link); 1351 u32 map_id = 0; 1352 1353 mutex_lock(&link_mutex); 1354 if (sk_link->map) 1355 map_id = sk_link->map->id; 1356 mutex_unlock(&link_mutex); 1357 1358 seq_printf(seq, "map_id:\t%u\n", map_id); 1359 seq_printf(seq, "attach_type:\t%u (...)\n", sk_link->attach_type); 1360 } 1361 1362 static const struct bpf_link_ops bpf_sk_msg_skb_link_ops = { 1363 .release = bpf_sk_msg_skb_link_release, 1364 .dealloc = bpf_sk_msg_skb_link_dealloc, 1365 .detach = bpf_sk_msg_skb_link_detach, 1366 .update_prog = bpf_sk_msg_skb_link_update_prog, 1367 .fill_link_info = bpf_sk_msg_skb_link_fill_info, 1368 .show_fdinfo = bpf_sk_msg_skb_link_show_fdinfo, 1369 }; 1370 1371 int bpf_sk_msg_skb_link_create(const union bpf_attr *attr, struct bpf_prog *prog) 1372 { 1373 struct bpf_link_primer link_primer; 1374 struct bpf_sk_msg_skb_link *sk_link; 1375 enum bpf_attach_type attach_type; 1376 enum bpf_link_type link_type; 1377 struct bpf_map *map; 1378 int ret; 1379 1380 if (attr->link_create.flags) 1381 return -EINVAL; 1382 > 1383 map = bpf_map_get_with_uref(attr->link_create.target_fd);
Hi Yonghong, kernel test robot noticed the following build errors: [auto build test ERROR on bpf-next/master] url: https://github.com/intel-lab-lkp/linux/commits/Yonghong-Song/bpf-Add-bpf_link-support-for-sk_msg-and-sk_skb-progs/20240320-015917 base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master patch link: https://lore.kernel.org/r/20240319175406.2940628-1-yonghong.song%40linux.dev patch subject: [PATCH bpf-next v2 1/6] bpf: Add bpf_link support for sk_msg and sk_skb progs config: nios2-randconfig-001-20240321 (https://download.01.org/0day-ci/archive/20240322/202403220545.KEzjr9hI-lkp@intel.com/config) compiler: nios2-linux-gcc (GCC) 13.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240322/202403220545.KEzjr9hI-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202403220545.KEzjr9hI-lkp@intel.com/ All error/warnings (new ones prefixed by >>): net/core/skmsg.c: In function 'bpf_sk_msg_skb_link_release': >> net/core/skmsg.c:1279:23: error: implicit declaration of function 'sock_map_prog_update' [-Werror=implicit-function-declaration] 1279 | (void)sock_map_prog_update(sk_link->map, NULL, link->prog, | ^~~~~~~~~~~~~~~~~~~~ >> net/core/skmsg.c:1281:17: error: implicit declaration of function 'bpf_map_put_with_uref' [-Werror=implicit-function-declaration] 1281 | bpf_map_put_with_uref(sk_link->map); | ^~~~~~~~~~~~~~~~~~~~~ net/core/skmsg.c: At top level: >> net/core/skmsg.c:1371:5: warning: no previous prototype for 'bpf_sk_msg_skb_link_create' [-Wmissing-prototypes] 1371 | int bpf_sk_msg_skb_link_create(const union bpf_attr *attr, struct bpf_prog *prog) | ^~~~~~~~~~~~~~~~~~~~~~~~~~ net/core/skmsg.c: In function 'bpf_sk_msg_skb_link_create': >> net/core/skmsg.c:1383:15: error: implicit declaration of function 'bpf_map_get_with_uref' [-Werror=implicit-function-declaration] 1383 | map = bpf_map_get_with_uref(attr->link_create.target_fd); | ^~~~~~~~~~~~~~~~~~~~~ >> net/core/skmsg.c:1383:13: warning: assignment to 'struct bpf_map *' from 'int' makes pointer from integer without a cast [-Wint-conversion] 1383 | map = bpf_map_get_with_uref(attr->link_create.target_fd); | ^ cc1: some warnings being treated as errors vim +/sock_map_prog_update +1279 net/core/skmsg.c 1272 1273 static void bpf_sk_msg_skb_link_release(struct bpf_link *link) 1274 { 1275 struct bpf_sk_msg_skb_link *sk_link = bpf_sk_msg_skb_link(link); 1276 1277 mutex_lock(&link_mutex); 1278 if (sk_link->map) { > 1279 (void)sock_map_prog_update(sk_link->map, NULL, link->prog, 1280 sk_link->attach_type); > 1281 bpf_map_put_with_uref(sk_link->map); 1282 sk_link->map = NULL; 1283 } 1284 mutex_unlock(&link_mutex); 1285 } 1286 1287 static int bpf_sk_msg_skb_link_detach(struct bpf_link *link) 1288 { 1289 bpf_sk_msg_skb_link_release(link); 1290 return 0; 1291 } 1292 1293 static void bpf_sk_msg_skb_link_dealloc(struct bpf_link *link) 1294 { 1295 kfree(bpf_sk_msg_skb_link(link)); 1296 } 1297 1298 static int bpf_sk_msg_skb_link_update_prog(struct bpf_link *link, 1299 struct bpf_prog *new_prog, 1300 struct bpf_prog *old_prog) 1301 { 1302 const struct bpf_sk_msg_skb_link *sk_link = bpf_sk_msg_skb_link(link); 1303 int ret = 0; 1304 1305 mutex_lock(&link_mutex); 1306 if (old_prog && link->prog != old_prog) { 1307 ret = -EPERM; 1308 goto out; 1309 } 1310 1311 if (link->prog->type != new_prog->type) { 1312 ret = -EINVAL; 1313 goto out; 1314 } 1315 1316 ret = sock_map_prog_update(sk_link->map, new_prog, old_prog, 1317 sk_link->attach_type); 1318 if (!ret) 1319 bpf_prog_inc(new_prog); 1320 1321 out: 1322 mutex_unlock(&link_mutex); 1323 return ret; 1324 } 1325 1326 static int bpf_sk_msg_skb_link_fill_info(const struct bpf_link *link, 1327 struct bpf_link_info *info) 1328 { 1329 const struct bpf_sk_msg_skb_link *sk_link = bpf_sk_msg_skb_link(link); 1330 u32 map_id = 0; 1331 1332 mutex_lock(&link_mutex); 1333 if (sk_link->map) 1334 map_id = sk_link->map->id; 1335 mutex_unlock(&link_mutex); 1336 1337 if (link->type == BPF_LINK_TYPE_SK_MSG) { 1338 info->skmsg.map_id = map_id; 1339 info->skmsg.attach_type = sk_link->attach_type; 1340 } else { 1341 info->skskb.map_id = map_id; 1342 info->skskb.attach_type = sk_link->attach_type; 1343 } 1344 return 0; 1345 } 1346 1347 static void bpf_sk_msg_skb_link_show_fdinfo(const struct bpf_link *link, 1348 struct seq_file *seq) 1349 { 1350 const struct bpf_sk_msg_skb_link *sk_link = bpf_sk_msg_skb_link(link); 1351 u32 map_id = 0; 1352 1353 mutex_lock(&link_mutex); 1354 if (sk_link->map) 1355 map_id = sk_link->map->id; 1356 mutex_unlock(&link_mutex); 1357 1358 seq_printf(seq, "map_id:\t%u\n", map_id); 1359 seq_printf(seq, "attach_type:\t%u (...)\n", sk_link->attach_type); 1360 } 1361 1362 static const struct bpf_link_ops bpf_sk_msg_skb_link_ops = { 1363 .release = bpf_sk_msg_skb_link_release, 1364 .dealloc = bpf_sk_msg_skb_link_dealloc, 1365 .detach = bpf_sk_msg_skb_link_detach, 1366 .update_prog = bpf_sk_msg_skb_link_update_prog, 1367 .fill_link_info = bpf_sk_msg_skb_link_fill_info, 1368 .show_fdinfo = bpf_sk_msg_skb_link_show_fdinfo, 1369 }; 1370 > 1371 int bpf_sk_msg_skb_link_create(const union bpf_attr *attr, struct bpf_prog *prog) 1372 { 1373 struct bpf_link_primer link_primer; 1374 struct bpf_sk_msg_skb_link *sk_link; 1375 enum bpf_attach_type attach_type; 1376 enum bpf_link_type link_type; 1377 struct bpf_map *map; 1378 int ret; 1379 1380 if (attr->link_create.flags) 1381 return -EINVAL; 1382 > 1383 map = bpf_map_get_with_uref(attr->link_create.target_fd);
On Tue, Mar 19, 2024 at 10:54 AM Yonghong Song <yonghong.song@linux.dev> wrote: > > Add bpf_link support for sk_msg and sk_skb programs. We have an > internal request to support bpf_link for sk_msg programs so user > space can have a uniform handling with bpf_link based libbpf > APIs. Using bpf_link based libbpf API also has a benefit which > makes system robust by decoupling prog life cycle and > attachment life cycle. > > Signed-off-by: Yonghong Song <yonghong.song@linux.dev> > --- > include/linux/bpf.h | 13 +++ > include/uapi/linux/bpf.h | 10 ++ > kernel/bpf/syscall.c | 4 + > net/core/skmsg.c | 164 +++++++++++++++++++++++++++++++++ > net/core/sock_map.c | 6 +- > tools/include/uapi/linux/bpf.h | 10 ++ > 6 files changed, 203 insertions(+), 4 deletions(-) > [...] > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > index 3c42b9f1bada..c5506cfca4f8 100644 > --- a/include/uapi/linux/bpf.h > +++ b/include/uapi/linux/bpf.h > @@ -1135,6 +1135,8 @@ enum bpf_link_type { > BPF_LINK_TYPE_TCX = 11, > BPF_LINK_TYPE_UPROBE_MULTI = 12, > BPF_LINK_TYPE_NETKIT = 13, > + BPF_LINK_TYPE_SK_MSG = 14, > + BPF_LINK_TYPE_SK_SKB = 15, they are both "sockmap attachments", so maybe we should just have something like BPF_LINK_TYPE_SOCKMAP ? > __MAX_BPF_LINK_TYPE, > }; > > @@ -6718,6 +6720,14 @@ struct bpf_link_info { > __u32 ifindex; > __u32 attach_type; > } netkit; > + struct { > + __u32 map_id; > + __u32 attach_type; > + } skmsg; > + struct { > + __u32 map_id; > + __u32 attach_type; > + } skskb; and then this would be also just one struct, instead of two identical ones duplicated > }; > } __attribute__((aligned(8))); > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > index ae2ff73bde7e..3d13eec5a30d 100644 > --- a/kernel/bpf/syscall.c > +++ b/kernel/bpf/syscall.c > @@ -5213,6 +5213,10 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr) > case BPF_PROG_TYPE_SK_LOOKUP: > ret = netns_bpf_link_create(attr, prog); > break; > + case BPF_PROG_TYPE_SK_MSG: > + case BPF_PROG_TYPE_SK_SKB: > + ret = bpf_sk_msg_skb_link_create(attr, prog); > + break; > #ifdef CONFIG_NET > case BPF_PROG_TYPE_XDP: > ret = bpf_xdp_link_attach(attr, prog); > diff --git a/net/core/skmsg.c b/net/core/skmsg.c > index 4d75ef9d24bf..1aa900ad54d7 100644 > --- a/net/core/skmsg.c > +++ b/net/core/skmsg.c > @@ -1256,3 +1256,167 @@ void sk_psock_stop_verdict(struct sock *sk, struct sk_psock *psock) > sk->sk_data_ready = psock->saved_data_ready; > psock->saved_data_ready = NULL; > } > + > +struct bpf_sk_msg_skb_link { > + struct bpf_link link; > + struct bpf_map *map; > + enum bpf_attach_type attach_type; > +}; > + > +static DEFINE_MUTEX(link_mutex); maybe more specific name, sockmap_link_mutex? link_mutex sounds very generic > + > +static struct bpf_sk_msg_skb_link *bpf_sk_msg_skb_link(const struct bpf_link *link) > +{ > + return container_of(link, struct bpf_sk_msg_skb_link, link); > +} > + [...] > + attach_type = attr->link_create.attach_type; > + bpf_link_init(&sk_link->link, link_type, &bpf_sk_msg_skb_link_ops, prog); > + sk_link->map = map; > + sk_link->attach_type = attach_type; > + > + ret = bpf_link_prime(&sk_link->link, &link_primer); > + if (ret) { > + kfree(sk_link); > + goto out; > + } > + > + ret = sock_map_prog_update(map, prog, NULL, attach_type); Does anything prevent someone else do to remove this program from user-space, bypassing the link? It's a guarantee of a link that attachment won't be tampered with (except for SYS_ADMIN-only force-detachment, which is a completely separate thing). It feels like there should be some sort of protection for programs attached through sockmap link here. Just like we have this for XDP, for example, or any of cgroup BPF programs attached through BPF link. > + if (ret) { > + bpf_link_cleanup(&link_primer); > + goto out; > + } > + > + bpf_prog_inc(prog); > + > + return bpf_link_settle(&link_primer); > + > +out: > + bpf_map_put_with_uref(map); > + return ret; > +} [...]
On 3/22/24 11:45 AM, Andrii Nakryiko wrote: > On Tue, Mar 19, 2024 at 10:54 AM Yonghong Song <yonghong.song@linux.dev> wrote: >> Add bpf_link support for sk_msg and sk_skb programs. We have an >> internal request to support bpf_link for sk_msg programs so user >> space can have a uniform handling with bpf_link based libbpf >> APIs. Using bpf_link based libbpf API also has a benefit which >> makes system robust by decoupling prog life cycle and >> attachment life cycle. >> >> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> >> --- >> include/linux/bpf.h | 13 +++ >> include/uapi/linux/bpf.h | 10 ++ >> kernel/bpf/syscall.c | 4 + >> net/core/skmsg.c | 164 +++++++++++++++++++++++++++++++++ >> net/core/sock_map.c | 6 +- >> tools/include/uapi/linux/bpf.h | 10 ++ >> 6 files changed, 203 insertions(+), 4 deletions(-) >> > [...] > >> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h >> index 3c42b9f1bada..c5506cfca4f8 100644 >> --- a/include/uapi/linux/bpf.h >> +++ b/include/uapi/linux/bpf.h >> @@ -1135,6 +1135,8 @@ enum bpf_link_type { >> BPF_LINK_TYPE_TCX = 11, >> BPF_LINK_TYPE_UPROBE_MULTI = 12, >> BPF_LINK_TYPE_NETKIT = 13, >> + BPF_LINK_TYPE_SK_MSG = 14, >> + BPF_LINK_TYPE_SK_SKB = 15, > they are both "sockmap attachments", so maybe we should just have > something like BPF_LINK_TYPE_SOCKMAP ? Yes, we could do this. Basically it represents all programs which can be attached to sockmap. > >> __MAX_BPF_LINK_TYPE, >> }; >> >> @@ -6718,6 +6720,14 @@ struct bpf_link_info { >> __u32 ifindex; >> __u32 attach_type; >> } netkit; >> + struct { >> + __u32 map_id; >> + __u32 attach_type; >> + } skmsg; >> + struct { >> + __u32 map_id; >> + __u32 attach_type; >> + } skskb; > and then this would be also just one struct, instead of two identical > ones duplicated Right, we could do one with name 'sockmap'. > >> }; >> } __attribute__((aligned(8))); >> >> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c >> index ae2ff73bde7e..3d13eec5a30d 100644 >> --- a/kernel/bpf/syscall.c >> +++ b/kernel/bpf/syscall.c >> @@ -5213,6 +5213,10 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr) >> case BPF_PROG_TYPE_SK_LOOKUP: >> ret = netns_bpf_link_create(attr, prog); >> break; >> + case BPF_PROG_TYPE_SK_MSG: >> + case BPF_PROG_TYPE_SK_SKB: >> + ret = bpf_sk_msg_skb_link_create(attr, prog); >> + break; >> #ifdef CONFIG_NET >> case BPF_PROG_TYPE_XDP: >> ret = bpf_xdp_link_attach(attr, prog); >> diff --git a/net/core/skmsg.c b/net/core/skmsg.c >> index 4d75ef9d24bf..1aa900ad54d7 100644 >> --- a/net/core/skmsg.c >> +++ b/net/core/skmsg.c >> @@ -1256,3 +1256,167 @@ void sk_psock_stop_verdict(struct sock *sk, struct sk_psock *psock) >> sk->sk_data_ready = psock->saved_data_ready; >> psock->saved_data_ready = NULL; >> } >> + >> +struct bpf_sk_msg_skb_link { >> + struct bpf_link link; >> + struct bpf_map *map; >> + enum bpf_attach_type attach_type; >> +}; >> + >> +static DEFINE_MUTEX(link_mutex); > maybe more specific name, sockmap_link_mutex? link_mutex sounds very generic Good idea. > >> + >> +static struct bpf_sk_msg_skb_link *bpf_sk_msg_skb_link(const struct bpf_link *link) >> +{ >> + return container_of(link, struct bpf_sk_msg_skb_link, link); >> +} >> + > [...] > >> + attach_type = attr->link_create.attach_type; >> + bpf_link_init(&sk_link->link, link_type, &bpf_sk_msg_skb_link_ops, prog); >> + sk_link->map = map; >> + sk_link->attach_type = attach_type; >> + >> + ret = bpf_link_prime(&sk_link->link, &link_primer); >> + if (ret) { >> + kfree(sk_link); >> + goto out; >> + } >> + >> + ret = sock_map_prog_update(map, prog, NULL, attach_type); > Does anything prevent someone else do to remove this program from > user-space, bypassing the link? It's a guarantee of a link that > attachment won't be tampered with (except for SYS_ADMIN-only > force-detachment, which is a completely separate thing). > > It feels like there should be some sort of protection for programs > attached through sockmap link here. Just like we have this for XDP, > for example, or any of cgroup BPF programs attached through BPF link. Good point. I have a 'bpf_prog_inc(prog)' below, I could do a refcount increase before sock_map_prog_update(), we then should be okay. > >> + if (ret) { >> + bpf_link_cleanup(&link_primer); >> + goto out; >> + } >> + >> + bpf_prog_inc(prog); >> + >> + return bpf_link_settle(&link_primer); >> + >> +out: >> + bpf_map_put_with_uref(map); >> + return ret; >> +} > [...]
On Fri, Mar 22, 2024 at 12:18 PM Yonghong Song <yonghong.song@linux.dev> wrote: > > > On 3/22/24 11:45 AM, Andrii Nakryiko wrote: > > On Tue, Mar 19, 2024 at 10:54 AM Yonghong Song <yonghong.song@linux.dev> wrote: > >> Add bpf_link support for sk_msg and sk_skb programs. We have an > >> internal request to support bpf_link for sk_msg programs so user > >> space can have a uniform handling with bpf_link based libbpf > >> APIs. Using bpf_link based libbpf API also has a benefit which > >> makes system robust by decoupling prog life cycle and > >> attachment life cycle. > >> > >> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> > >> --- > >> include/linux/bpf.h | 13 +++ > >> include/uapi/linux/bpf.h | 10 ++ > >> kernel/bpf/syscall.c | 4 + > >> net/core/skmsg.c | 164 +++++++++++++++++++++++++++++++++ > >> net/core/sock_map.c | 6 +- > >> tools/include/uapi/linux/bpf.h | 10 ++ > >> 6 files changed, 203 insertions(+), 4 deletions(-) > >> > > [...] > > > >> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > >> index 3c42b9f1bada..c5506cfca4f8 100644 > >> --- a/include/uapi/linux/bpf.h > >> +++ b/include/uapi/linux/bpf.h > >> @@ -1135,6 +1135,8 @@ enum bpf_link_type { > >> BPF_LINK_TYPE_TCX = 11, > >> BPF_LINK_TYPE_UPROBE_MULTI = 12, > >> BPF_LINK_TYPE_NETKIT = 13, > >> + BPF_LINK_TYPE_SK_MSG = 14, > >> + BPF_LINK_TYPE_SK_SKB = 15, > > they are both "sockmap attachments", so maybe we should just have > > something like BPF_LINK_TYPE_SOCKMAP ? > > Yes, we could do this. Basically it represents all programs > which can be attached to sockmap. > > > > >> __MAX_BPF_LINK_TYPE, > >> }; > >> > >> @@ -6718,6 +6720,14 @@ struct bpf_link_info { > >> __u32 ifindex; > >> __u32 attach_type; > >> } netkit; > >> + struct { > >> + __u32 map_id; > >> + __u32 attach_type; > >> + } skmsg; > >> + struct { > >> + __u32 map_id; > >> + __u32 attach_type; > >> + } skskb; > > and then this would be also just one struct, instead of two identical > > ones duplicated > > Right, we could do one with name 'sockmap'. > > > > >> }; > >> } __attribute__((aligned(8))); > >> > >> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > >> index ae2ff73bde7e..3d13eec5a30d 100644 > >> --- a/kernel/bpf/syscall.c > >> +++ b/kernel/bpf/syscall.c > >> @@ -5213,6 +5213,10 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr) > >> case BPF_PROG_TYPE_SK_LOOKUP: > >> ret = netns_bpf_link_create(attr, prog); > >> break; > >> + case BPF_PROG_TYPE_SK_MSG: > >> + case BPF_PROG_TYPE_SK_SKB: > >> + ret = bpf_sk_msg_skb_link_create(attr, prog); > >> + break; > >> #ifdef CONFIG_NET > >> case BPF_PROG_TYPE_XDP: > >> ret = bpf_xdp_link_attach(attr, prog); > >> diff --git a/net/core/skmsg.c b/net/core/skmsg.c > >> index 4d75ef9d24bf..1aa900ad54d7 100644 > >> --- a/net/core/skmsg.c > >> +++ b/net/core/skmsg.c > >> @@ -1256,3 +1256,167 @@ void sk_psock_stop_verdict(struct sock *sk, struct sk_psock *psock) > >> sk->sk_data_ready = psock->saved_data_ready; > >> psock->saved_data_ready = NULL; > >> } > >> + > >> +struct bpf_sk_msg_skb_link { > >> + struct bpf_link link; > >> + struct bpf_map *map; > >> + enum bpf_attach_type attach_type; > >> +}; > >> + > >> +static DEFINE_MUTEX(link_mutex); > > maybe more specific name, sockmap_link_mutex? link_mutex sounds very generic > > Good idea. > > > > >> + > >> +static struct bpf_sk_msg_skb_link *bpf_sk_msg_skb_link(const struct bpf_link *link) > >> +{ > >> + return container_of(link, struct bpf_sk_msg_skb_link, link); > >> +} > >> + > > [...] > > > >> + attach_type = attr->link_create.attach_type; > >> + bpf_link_init(&sk_link->link, link_type, &bpf_sk_msg_skb_link_ops, prog); > >> + sk_link->map = map; > >> + sk_link->attach_type = attach_type; > >> + > >> + ret = bpf_link_prime(&sk_link->link, &link_primer); > >> + if (ret) { > >> + kfree(sk_link); > >> + goto out; > >> + } > >> + > >> + ret = sock_map_prog_update(map, prog, NULL, attach_type); > > Does anything prevent someone else do to remove this program from > > user-space, bypassing the link? It's a guarantee of a link that > > attachment won't be tampered with (except for SYS_ADMIN-only > > force-detachment, which is a completely separate thing). > > > > It feels like there should be some sort of protection for programs > > attached through sockmap link here. Just like we have this for XDP, > > for example, or any of cgroup BPF programs attached through BPF link. > > Good point. I have a 'bpf_prog_inc(prog)' below, I could do a refcount increase > before sock_map_prog_update(), we then should be okay. > My point was that once you attach a program to sockmap with LINK_CREATE, someone else just doing bpf_prog_attac() shouldn't replace this program anymore. BPF link preserves *attachment lifetime*, not just the program lifetime. > > > >> + if (ret) { > >> + bpf_link_cleanup(&link_primer); > >> + goto out; > >> + } > >> + > >> + bpf_prog_inc(prog); > >> + > >> + return bpf_link_settle(&link_primer); > >> + > >> +out: > >> + bpf_map_put_with_uref(map); > >> + return ret; > >> +} > > [...]
On 3/22/24 1:16 PM, Andrii Nakryiko wrote: > On Fri, Mar 22, 2024 at 12:18 PM Yonghong Song <yonghong.song@linux.dev> wrote: >> >> On 3/22/24 11:45 AM, Andrii Nakryiko wrote: >>> On Tue, Mar 19, 2024 at 10:54 AM Yonghong Song <yonghong.song@linux.dev> wrote: >>>> Add bpf_link support for sk_msg and sk_skb programs. We have an >>>> internal request to support bpf_link for sk_msg programs so user >>>> space can have a uniform handling with bpf_link based libbpf >>>> APIs. Using bpf_link based libbpf API also has a benefit which >>>> makes system robust by decoupling prog life cycle and >>>> attachment life cycle. >>>> >>>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> >>>> --- >>>> include/linux/bpf.h | 13 +++ >>>> include/uapi/linux/bpf.h | 10 ++ >>>> kernel/bpf/syscall.c | 4 + >>>> net/core/skmsg.c | 164 +++++++++++++++++++++++++++++++++ >>>> net/core/sock_map.c | 6 +- >>>> tools/include/uapi/linux/bpf.h | 10 ++ >>>> 6 files changed, 203 insertions(+), 4 deletions(-) >>>> >>> [...] >>> >>>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h >>>> index 3c42b9f1bada..c5506cfca4f8 100644 >>>> --- a/include/uapi/linux/bpf.h >>>> +++ b/include/uapi/linux/bpf.h >>>> @@ -1135,6 +1135,8 @@ enum bpf_link_type { >>>> BPF_LINK_TYPE_TCX = 11, >>>> BPF_LINK_TYPE_UPROBE_MULTI = 12, >>>> BPF_LINK_TYPE_NETKIT = 13, >>>> + BPF_LINK_TYPE_SK_MSG = 14, >>>> + BPF_LINK_TYPE_SK_SKB = 15, >>> they are both "sockmap attachments", so maybe we should just have >>> something like BPF_LINK_TYPE_SOCKMAP ? >> Yes, we could do this. Basically it represents all programs >> which can be attached to sockmap. >> >>>> __MAX_BPF_LINK_TYPE, >>>> }; >>>> >>>> @@ -6718,6 +6720,14 @@ struct bpf_link_info { >>>> __u32 ifindex; >>>> __u32 attach_type; >>>> } netkit; >>>> + struct { >>>> + __u32 map_id; >>>> + __u32 attach_type; >>>> + } skmsg; >>>> + struct { >>>> + __u32 map_id; >>>> + __u32 attach_type; >>>> + } skskb; >>> and then this would be also just one struct, instead of two identical >>> ones duplicated >> Right, we could do one with name 'sockmap'. >> >>>> }; >>>> } __attribute__((aligned(8))); >>>> >>>> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c >>>> index ae2ff73bde7e..3d13eec5a30d 100644 >>>> --- a/kernel/bpf/syscall.c >>>> +++ b/kernel/bpf/syscall.c >>>> @@ -5213,6 +5213,10 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr) >>>> case BPF_PROG_TYPE_SK_LOOKUP: >>>> ret = netns_bpf_link_create(attr, prog); >>>> break; >>>> + case BPF_PROG_TYPE_SK_MSG: >>>> + case BPF_PROG_TYPE_SK_SKB: >>>> + ret = bpf_sk_msg_skb_link_create(attr, prog); >>>> + break; >>>> #ifdef CONFIG_NET >>>> case BPF_PROG_TYPE_XDP: >>>> ret = bpf_xdp_link_attach(attr, prog); >>>> diff --git a/net/core/skmsg.c b/net/core/skmsg.c >>>> index 4d75ef9d24bf..1aa900ad54d7 100644 >>>> --- a/net/core/skmsg.c >>>> +++ b/net/core/skmsg.c >>>> @@ -1256,3 +1256,167 @@ void sk_psock_stop_verdict(struct sock *sk, struct sk_psock *psock) >>>> sk->sk_data_ready = psock->saved_data_ready; >>>> psock->saved_data_ready = NULL; >>>> } >>>> + >>>> +struct bpf_sk_msg_skb_link { >>>> + struct bpf_link link; >>>> + struct bpf_map *map; >>>> + enum bpf_attach_type attach_type; >>>> +}; >>>> + >>>> +static DEFINE_MUTEX(link_mutex); >>> maybe more specific name, sockmap_link_mutex? link_mutex sounds very generic >> Good idea. >> >>>> + >>>> +static struct bpf_sk_msg_skb_link *bpf_sk_msg_skb_link(const struct bpf_link *link) >>>> +{ >>>> + return container_of(link, struct bpf_sk_msg_skb_link, link); >>>> +} >>>> + >>> [...] >>> >>>> + attach_type = attr->link_create.attach_type; >>>> + bpf_link_init(&sk_link->link, link_type, &bpf_sk_msg_skb_link_ops, prog); >>>> + sk_link->map = map; >>>> + sk_link->attach_type = attach_type; >>>> + >>>> + ret = bpf_link_prime(&sk_link->link, &link_primer); >>>> + if (ret) { >>>> + kfree(sk_link); >>>> + goto out; >>>> + } >>>> + >>>> + ret = sock_map_prog_update(map, prog, NULL, attach_type); >>> Does anything prevent someone else do to remove this program from >>> user-space, bypassing the link? It's a guarantee of a link that >>> attachment won't be tampered with (except for SYS_ADMIN-only >>> force-detachment, which is a completely separate thing). >>> >>> It feels like there should be some sort of protection for programs >>> attached through sockmap link here. Just like we have this for XDP, >>> for example, or any of cgroup BPF programs attached through BPF link. >> Good point. I have a 'bpf_prog_inc(prog)' below, I could do a refcount increase >> before sock_map_prog_update(), we then should be okay. >> > My point was that once you attach a program to sockmap with > LINK_CREATE, someone else just doing bpf_prog_attac() shouldn't > replace this program anymore. BPF link preserves *attachment > lifetime*, not just the program lifetime. I see. I briefly looked at xdp and cgroup where both support link-based attachment and program base attachment. For xdp, indeed if link attachment is already done, prog attach is not allowed. cgroup seems more complicated and need further study to confirm whether BPF link preserves attachment in all cases or not. But any way, since bpf_link is a new feature for sockmap. It makes sense to follow the xdp-link style to disallow bpf_prog_attach once link is created. > >>>> + if (ret) { >>>> + bpf_link_cleanup(&link_primer); >>>> + goto out; >>>> + } >>>> + >>>> + bpf_prog_inc(prog); >>>> + >>>> + return bpf_link_settle(&link_primer); >>>> + >>>> +out: >>>> + bpf_map_put_with_uref(map); >>>> + return ret; >>>> +} >>> [...]
On Fri, Mar 22, 2024 at 2:34 PM Yonghong Song <yonghong.song@linux.dev> wrote: > > > On 3/22/24 1:16 PM, Andrii Nakryiko wrote: > > On Fri, Mar 22, 2024 at 12:18 PM Yonghong Song <yonghong.song@linux.dev> wrote: > >> > >> On 3/22/24 11:45 AM, Andrii Nakryiko wrote: > >>> On Tue, Mar 19, 2024 at 10:54 AM Yonghong Song <yonghong.song@linux.dev> wrote: > >>>> Add bpf_link support for sk_msg and sk_skb programs. We have an > >>>> internal request to support bpf_link for sk_msg programs so user > >>>> space can have a uniform handling with bpf_link based libbpf > >>>> APIs. Using bpf_link based libbpf API also has a benefit which > >>>> makes system robust by decoupling prog life cycle and > >>>> attachment life cycle. > >>>> > >>>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> > >>>> --- > >>>> include/linux/bpf.h | 13 +++ > >>>> include/uapi/linux/bpf.h | 10 ++ > >>>> kernel/bpf/syscall.c | 4 + > >>>> net/core/skmsg.c | 164 +++++++++++++++++++++++++++++++++ > >>>> net/core/sock_map.c | 6 +- > >>>> tools/include/uapi/linux/bpf.h | 10 ++ > >>>> 6 files changed, 203 insertions(+), 4 deletions(-) > >>>> > >>> [...] > >>> > >>>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > >>>> index 3c42b9f1bada..c5506cfca4f8 100644 > >>>> --- a/include/uapi/linux/bpf.h > >>>> +++ b/include/uapi/linux/bpf.h > >>>> @@ -1135,6 +1135,8 @@ enum bpf_link_type { > >>>> BPF_LINK_TYPE_TCX = 11, > >>>> BPF_LINK_TYPE_UPROBE_MULTI = 12, > >>>> BPF_LINK_TYPE_NETKIT = 13, > >>>> + BPF_LINK_TYPE_SK_MSG = 14, > >>>> + BPF_LINK_TYPE_SK_SKB = 15, > >>> they are both "sockmap attachments", so maybe we should just have > >>> something like BPF_LINK_TYPE_SOCKMAP ? > >> Yes, we could do this. Basically it represents all programs > >> which can be attached to sockmap. > >> > >>>> __MAX_BPF_LINK_TYPE, > >>>> }; > >>>> > >>>> @@ -6718,6 +6720,14 @@ struct bpf_link_info { > >>>> __u32 ifindex; > >>>> __u32 attach_type; > >>>> } netkit; > >>>> + struct { > >>>> + __u32 map_id; > >>>> + __u32 attach_type; > >>>> + } skmsg; > >>>> + struct { > >>>> + __u32 map_id; > >>>> + __u32 attach_type; > >>>> + } skskb; > >>> and then this would be also just one struct, instead of two identical > >>> ones duplicated > >> Right, we could do one with name 'sockmap'. > >> > >>>> }; > >>>> } __attribute__((aligned(8))); > >>>> > >>>> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > >>>> index ae2ff73bde7e..3d13eec5a30d 100644 > >>>> --- a/kernel/bpf/syscall.c > >>>> +++ b/kernel/bpf/syscall.c > >>>> @@ -5213,6 +5213,10 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr) > >>>> case BPF_PROG_TYPE_SK_LOOKUP: > >>>> ret = netns_bpf_link_create(attr, prog); > >>>> break; > >>>> + case BPF_PROG_TYPE_SK_MSG: > >>>> + case BPF_PROG_TYPE_SK_SKB: > >>>> + ret = bpf_sk_msg_skb_link_create(attr, prog); > >>>> + break; > >>>> #ifdef CONFIG_NET > >>>> case BPF_PROG_TYPE_XDP: > >>>> ret = bpf_xdp_link_attach(attr, prog); > >>>> diff --git a/net/core/skmsg.c b/net/core/skmsg.c > >>>> index 4d75ef9d24bf..1aa900ad54d7 100644 > >>>> --- a/net/core/skmsg.c > >>>> +++ b/net/core/skmsg.c > >>>> @@ -1256,3 +1256,167 @@ void sk_psock_stop_verdict(struct sock *sk, struct sk_psock *psock) > >>>> sk->sk_data_ready = psock->saved_data_ready; > >>>> psock->saved_data_ready = NULL; > >>>> } > >>>> + > >>>> +struct bpf_sk_msg_skb_link { > >>>> + struct bpf_link link; > >>>> + struct bpf_map *map; > >>>> + enum bpf_attach_type attach_type; > >>>> +}; > >>>> + > >>>> +static DEFINE_MUTEX(link_mutex); > >>> maybe more specific name, sockmap_link_mutex? link_mutex sounds very generic > >> Good idea. > >> > >>>> + > >>>> +static struct bpf_sk_msg_skb_link *bpf_sk_msg_skb_link(const struct bpf_link *link) > >>>> +{ > >>>> + return container_of(link, struct bpf_sk_msg_skb_link, link); > >>>> +} > >>>> + > >>> [...] > >>> > >>>> + attach_type = attr->link_create.attach_type; > >>>> + bpf_link_init(&sk_link->link, link_type, &bpf_sk_msg_skb_link_ops, prog); > >>>> + sk_link->map = map; > >>>> + sk_link->attach_type = attach_type; > >>>> + > >>>> + ret = bpf_link_prime(&sk_link->link, &link_primer); > >>>> + if (ret) { > >>>> + kfree(sk_link); > >>>> + goto out; > >>>> + } > >>>> + > >>>> + ret = sock_map_prog_update(map, prog, NULL, attach_type); > >>> Does anything prevent someone else do to remove this program from > >>> user-space, bypassing the link? It's a guarantee of a link that > >>> attachment won't be tampered with (except for SYS_ADMIN-only > >>> force-detachment, which is a completely separate thing). > >>> > >>> It feels like there should be some sort of protection for programs > >>> attached through sockmap link here. Just like we have this for XDP, > >>> for example, or any of cgroup BPF programs attached through BPF link. > >> Good point. I have a 'bpf_prog_inc(prog)' below, I could do a refcount increase > >> before sock_map_prog_update(), we then should be okay. > >> > > My point was that once you attach a program to sockmap with > > LINK_CREATE, someone else just doing bpf_prog_attac() shouldn't > > replace this program anymore. BPF link preserves *attachment > > lifetime*, not just the program lifetime. > > I see. I briefly looked at xdp and cgroup where both support > link-based attachment and program base attachment. > For xdp, indeed if link attachment is already done, > prog attach is not allowed. > cgroup seems more complicated and need further study to > confirm whether BPF link preserves attachment in all > cases or not. it does, I implemented it, it was an explicit design decision > But any way, since bpf_link is a new feature for > sockmap. It makes sense to follow the xdp-link > style to disallow bpf_prog_attach once link is > created. > > > > >>>> + if (ret) { > >>>> + bpf_link_cleanup(&link_primer); > >>>> + goto out; > >>>> + } > >>>> + > >>>> + bpf_prog_inc(prog); > >>>> + > >>>> + return bpf_link_settle(&link_primer); > >>>> + > >>>> +out: > >>>> + bpf_map_put_with_uref(map); > >>>> + return ret; > >>>> +} > >>> [...]
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 17843e66a1d3..8dabd47d3668 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -2990,10 +2990,14 @@ int sock_map_prog_detach(const union bpf_attr *attr, enum bpf_prog_type ptype); int sock_map_update_elem_sys(struct bpf_map *map, void *key, void *value, u64 flags); int sock_map_bpf_prog_query(const union bpf_attr *attr, union bpf_attr __user *uattr); +int sock_map_prog_update(struct bpf_map *map, struct bpf_prog *prog, + struct bpf_prog *old, u32 which); void sock_map_unhash(struct sock *sk); void sock_map_destroy(struct sock *sk); void sock_map_close(struct sock *sk, long timeout); + +int bpf_sk_msg_skb_link_create(const union bpf_attr *attr, struct bpf_prog *prog); #else static inline int bpf_dev_bound_kfunc_check(struct bpf_verifier_log *log, struct bpf_prog_aux *prog_aux) @@ -3088,6 +3092,15 @@ static inline int sock_map_bpf_prog_query(const union bpf_attr *attr, { return -EINVAL; } +static inline int sock_map_prog_update(struct bpf_map *map, struct bpf_prog *prog, + struct bpf_prog *old, u32 which) +{ + return -EOPNOTSUPP; +} +static inline int bpf_sk_msg_skb_link_create(const union bpf_attr *attr, struct bpf_prog *prog) +{ + return -EOPNOTSUPP; +} #endif /* CONFIG_BPF_SYSCALL */ #endif /* CONFIG_NET && CONFIG_BPF_SYSCALL */ diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 3c42b9f1bada..c5506cfca4f8 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -1135,6 +1135,8 @@ enum bpf_link_type { BPF_LINK_TYPE_TCX = 11, BPF_LINK_TYPE_UPROBE_MULTI = 12, BPF_LINK_TYPE_NETKIT = 13, + BPF_LINK_TYPE_SK_MSG = 14, + BPF_LINK_TYPE_SK_SKB = 15, __MAX_BPF_LINK_TYPE, }; @@ -6718,6 +6720,14 @@ struct bpf_link_info { __u32 ifindex; __u32 attach_type; } netkit; + struct { + __u32 map_id; + __u32 attach_type; + } skmsg; + struct { + __u32 map_id; + __u32 attach_type; + } skskb; }; } __attribute__((aligned(8))); diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index ae2ff73bde7e..3d13eec5a30d 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -5213,6 +5213,10 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr) case BPF_PROG_TYPE_SK_LOOKUP: ret = netns_bpf_link_create(attr, prog); break; + case BPF_PROG_TYPE_SK_MSG: + case BPF_PROG_TYPE_SK_SKB: + ret = bpf_sk_msg_skb_link_create(attr, prog); + break; #ifdef CONFIG_NET case BPF_PROG_TYPE_XDP: ret = bpf_xdp_link_attach(attr, prog); diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 4d75ef9d24bf..1aa900ad54d7 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -1256,3 +1256,167 @@ void sk_psock_stop_verdict(struct sock *sk, struct sk_psock *psock) sk->sk_data_ready = psock->saved_data_ready; psock->saved_data_ready = NULL; } + +struct bpf_sk_msg_skb_link { + struct bpf_link link; + struct bpf_map *map; + enum bpf_attach_type attach_type; +}; + +static DEFINE_MUTEX(link_mutex); + +static struct bpf_sk_msg_skb_link *bpf_sk_msg_skb_link(const struct bpf_link *link) +{ + return container_of(link, struct bpf_sk_msg_skb_link, link); +} + +static void bpf_sk_msg_skb_link_release(struct bpf_link *link) +{ + struct bpf_sk_msg_skb_link *sk_link = bpf_sk_msg_skb_link(link); + + mutex_lock(&link_mutex); + if (sk_link->map) { + (void)sock_map_prog_update(sk_link->map, NULL, link->prog, + sk_link->attach_type); + bpf_map_put_with_uref(sk_link->map); + sk_link->map = NULL; + } + mutex_unlock(&link_mutex); +} + +static int bpf_sk_msg_skb_link_detach(struct bpf_link *link) +{ + bpf_sk_msg_skb_link_release(link); + return 0; +} + +static void bpf_sk_msg_skb_link_dealloc(struct bpf_link *link) +{ + kfree(bpf_sk_msg_skb_link(link)); +} + +static int bpf_sk_msg_skb_link_update_prog(struct bpf_link *link, + struct bpf_prog *new_prog, + struct bpf_prog *old_prog) +{ + const struct bpf_sk_msg_skb_link *sk_link = bpf_sk_msg_skb_link(link); + int ret = 0; + + mutex_lock(&link_mutex); + if (old_prog && link->prog != old_prog) { + ret = -EPERM; + goto out; + } + + if (link->prog->type != new_prog->type) { + ret = -EINVAL; + goto out; + } + + ret = sock_map_prog_update(sk_link->map, new_prog, old_prog, + sk_link->attach_type); + if (!ret) + bpf_prog_inc(new_prog); + +out: + mutex_unlock(&link_mutex); + return ret; +} + +static int bpf_sk_msg_skb_link_fill_info(const struct bpf_link *link, + struct bpf_link_info *info) +{ + const struct bpf_sk_msg_skb_link *sk_link = bpf_sk_msg_skb_link(link); + u32 map_id = 0; + + mutex_lock(&link_mutex); + if (sk_link->map) + map_id = sk_link->map->id; + mutex_unlock(&link_mutex); + + if (link->type == BPF_LINK_TYPE_SK_MSG) { + info->skmsg.map_id = map_id; + info->skmsg.attach_type = sk_link->attach_type; + } else { + info->skskb.map_id = map_id; + info->skskb.attach_type = sk_link->attach_type; + } + return 0; +} + +static void bpf_sk_msg_skb_link_show_fdinfo(const struct bpf_link *link, + struct seq_file *seq) +{ + const struct bpf_sk_msg_skb_link *sk_link = bpf_sk_msg_skb_link(link); + u32 map_id = 0; + + mutex_lock(&link_mutex); + if (sk_link->map) + map_id = sk_link->map->id; + mutex_unlock(&link_mutex); + + seq_printf(seq, "map_id:\t%u\n", map_id); + seq_printf(seq, "attach_type:\t%u (...)\n", sk_link->attach_type); +} + +static const struct bpf_link_ops bpf_sk_msg_skb_link_ops = { + .release = bpf_sk_msg_skb_link_release, + .dealloc = bpf_sk_msg_skb_link_dealloc, + .detach = bpf_sk_msg_skb_link_detach, + .update_prog = bpf_sk_msg_skb_link_update_prog, + .fill_link_info = bpf_sk_msg_skb_link_fill_info, + .show_fdinfo = bpf_sk_msg_skb_link_show_fdinfo, +}; + +int bpf_sk_msg_skb_link_create(const union bpf_attr *attr, struct bpf_prog *prog) +{ + struct bpf_link_primer link_primer; + struct bpf_sk_msg_skb_link *sk_link; + enum bpf_attach_type attach_type; + enum bpf_link_type link_type; + struct bpf_map *map; + int ret; + + if (attr->link_create.flags) + return -EINVAL; + + map = bpf_map_get_with_uref(attr->link_create.target_fd); + if (IS_ERR(map)) + return PTR_ERR(map); + + sk_link = kzalloc(sizeof(*sk_link), GFP_USER); + if (!sk_link) { + ret = -ENOMEM; + goto out; + } + + if (prog->type == BPF_PROG_TYPE_SK_MSG) + link_type = BPF_LINK_TYPE_SK_MSG; + else + link_type = BPF_LINK_TYPE_SK_SKB; + + attach_type = attr->link_create.attach_type; + bpf_link_init(&sk_link->link, link_type, &bpf_sk_msg_skb_link_ops, prog); + sk_link->map = map; + sk_link->attach_type = attach_type; + + ret = bpf_link_prime(&sk_link->link, &link_primer); + if (ret) { + kfree(sk_link); + goto out; + } + + ret = sock_map_prog_update(map, prog, NULL, attach_type); + if (ret) { + bpf_link_cleanup(&link_primer); + goto out; + } + + bpf_prog_inc(prog); + + return bpf_link_settle(&link_primer); + +out: + bpf_map_put_with_uref(map); + return ret; +} diff --git a/net/core/sock_map.c b/net/core/sock_map.c index 27d733c0f65e..63372bc368f1 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -24,8 +24,6 @@ struct bpf_stab { #define SOCK_CREATE_FLAG_MASK \ (BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY) -static int sock_map_prog_update(struct bpf_map *map, struct bpf_prog *prog, - struct bpf_prog *old, u32 which); static struct sk_psock_progs *sock_map_progs(struct bpf_map *map); static struct bpf_map *sock_map_alloc(union bpf_attr *attr) @@ -1488,8 +1486,8 @@ static int sock_map_prog_lookup(struct bpf_map *map, struct bpf_prog ***pprog, return 0; } -static int sock_map_prog_update(struct bpf_map *map, struct bpf_prog *prog, - struct bpf_prog *old, u32 which) +int sock_map_prog_update(struct bpf_map *map, struct bpf_prog *prog, + struct bpf_prog *old, u32 which) { struct bpf_prog **pprog; int ret; diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 3c42b9f1bada..c5506cfca4f8 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1135,6 +1135,8 @@ enum bpf_link_type { BPF_LINK_TYPE_TCX = 11, BPF_LINK_TYPE_UPROBE_MULTI = 12, BPF_LINK_TYPE_NETKIT = 13, + BPF_LINK_TYPE_SK_MSG = 14, + BPF_LINK_TYPE_SK_SKB = 15, __MAX_BPF_LINK_TYPE, }; @@ -6718,6 +6720,14 @@ struct bpf_link_info { __u32 ifindex; __u32 attach_type; } netkit; + struct { + __u32 map_id; + __u32 attach_type; + } skmsg; + struct { + __u32 map_id; + __u32 attach_type; + } skskb; }; } __attribute__((aligned(8)));
Add bpf_link support for sk_msg and sk_skb programs. We have an internal request to support bpf_link for sk_msg programs so user space can have a uniform handling with bpf_link based libbpf APIs. Using bpf_link based libbpf API also has a benefit which makes system robust by decoupling prog life cycle and attachment life cycle. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> --- include/linux/bpf.h | 13 +++ include/uapi/linux/bpf.h | 10 ++ kernel/bpf/syscall.c | 4 + net/core/skmsg.c | 164 +++++++++++++++++++++++++++++++++ net/core/sock_map.c | 6 +- tools/include/uapi/linux/bpf.h | 10 ++ 6 files changed, 203 insertions(+), 4 deletions(-)