From patchwork Tue Nov 28 02:30:56 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Quentin Deslandes
X-Patchwork-Id: 13470461
X-Patchwork-Delegate: dsahern@gmail.com
From: Quentin Deslandes
To:
CC: David Ahern, Martin KaFai Lau, Quentin Deslandes
Subject: [PATCH 1/3] ss: prevent "Process" column from being printed unless requested
Date: Mon, 27 Nov 2023 18:30:56 -0800
Message-ID: <20231128023058.53546-2-qde@naccy.de>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20231128023058.53546-1-qde@naccy.de>
References: <20231128023058.53546-1-qde@naccy.de>
X-Mailing-List: netdev@vger.kernel.org
Commit 5883c6eba517 ("ss: show header for --processes/-p") added "Process"
to the list of columns printed by ss. However, the "Process" header is now
printed even if --processes/-p is not used.

Fix this by moving the COL_PROC column ID to the same index as the
corresponding column structure in the columns array, and by only enabling
the column if --processes/-p is used.
Signed-off-by: Quentin Deslandes
---
 misc/ss.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/misc/ss.c b/misc/ss.c
index 9438382b..09dc1f37 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -100,8 +100,8 @@ enum col_id {
 	COL_SERV,
 	COL_RADDR,
 	COL_RSERV,
-	COL_EXT,
 	COL_PROC,
+	COL_EXT,
 	COL_MAX
 };
 
@@ -5795,6 +5795,9 @@ int main(int argc, char *argv[])
 	if (ssfilter_parse(&current_filter.f, argc, argv, filter_fp))
 		usage();
 
+	if (!show_processes)
+		columns[COL_PROC].disabled = 1;
+
 	if (!(current_filter.dbs & (current_filter.dbs - 1)))
 		columns[COL_NETID].disabled = 1;

From patchwork Tue Nov 28 02:30:57 2023
X-Patchwork-Submitter: Quentin Deslandes
X-Patchwork-Id: 13470460
X-Patchwork-Delegate: dsahern@gmail.com
From: Quentin Deslandes
To:
CC: David Ahern, Martin KaFai Lau, Quentin Deslandes
Subject: [PATCH 2/3] ss: add support for BPF socket-local storage
Date: Mon, 27 Nov 2023 18:30:57 -0800
Message-ID: <20231128023058.53546-3-qde@naccy.de>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20231128023058.53546-1-qde@naccy.de>
References: <20231128023058.53546-1-qde@naccy.de>
X-Mailing-List: netdev@vger.kernel.org
While sock_diag is able to return BPF socket-local storage in response to
INET_DIAG_REQ_SK_BPF_STORAGES requests, ss doesn't request it.

This change introduces the --bpf-maps and --bpf-map-id= options to request
BPF socket-local storage for all SK_STORAGE maps, or only for specific
ones. Most of this change is dedicated to checking the requested map IDs
and ensuring they are valid.

A new column named "Socket storage" has been added to print the list of
map IDs for which a given socket has data defined. This column is disabled
unless --bpf-maps or --bpf-map-id= is used.
Signed-off-by: Quentin Deslandes
Co-authored-by: Martin KaFai Lau
---
 misc/ss.c | 273 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 270 insertions(+), 3 deletions(-)

diff --git a/misc/ss.c b/misc/ss.c
index 09dc1f37..5b255ce3 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -51,6 +51,11 @@
 #include
 #include
 
+#ifdef HAVE_LIBBPF
+#include
+#include
+#endif
+
 #if HAVE_RPC
 #include
 #include
@@ -101,6 +106,7 @@ enum col_id {
 	COL_RADDR,
 	COL_RSERV,
 	COL_PROC,
+	COL_SKSTOR,
 	COL_EXT,
 	COL_MAX
 };
@@ -130,6 +136,7 @@ static struct column columns[] = {
 	{ ALIGN_RIGHT, "Peer Address:", " ", 0, 0, 0 },
 	{ ALIGN_LEFT,  "Port", "", 0, 0, 0 },
 	{ ALIGN_LEFT,  "Process", "", 0, 0, 0 },
+	{ ALIGN_LEFT,  "Socket storage", "", 1, 0, 0 },
 	{ ALIGN_LEFT,  "", "", 0, 0, 0 },
 };
@@ -3368,6 +3375,222 @@ static void parse_diag_msg(struct nlmsghdr *nlh, struct sockstat *s)
 	memcpy(s->remote.data, r->id.idiag_dst, s->local.bytelen);
 }
 
+#ifdef HAVE_LIBBPF
+
+#define MAX_NR_BPF_MAP_ID_OPTS 32
+
+struct btf;
+
+static struct bpf_map_opts {
+	unsigned int nr_maps;
+	struct bpf_sk_storage_map_info {
+		unsigned int id;
+		int fd;
+	} maps[MAX_NR_BPF_MAP_ID_OPTS];
+	bool show_all;
+	struct btf *kernel_btf;
+} bpf_map_opts;
+
+static void bpf_map_opts_mixed_error(void)
+{
+	fprintf(stderr,
+		"ss: --bpf-maps and --bpf-map-id cannot be used together\n");
+}
+
+static int bpf_map_opts_add_all(void)
+{
+	unsigned int i;
+	unsigned int fd;
+	uint32_t id = 0;
+	int r;
+
+	if (bpf_map_opts.nr_maps) {
+		bpf_map_opts_mixed_error();
+		return -1;
+	}
+
+	while (1) {
+		struct bpf_map_info info = {};
+		uint32_t len = sizeof(info);
+
+		r = bpf_map_get_next_id(id, &id);
+		if (r) {
+			if (errno == ENOENT)
+				break;
+
+			fprintf(stderr, "ss: failed to fetch BPF map ID\n");
+			goto err;
+		}
+
+		fd = bpf_map_get_fd_by_id(id);
+		if (fd == -1) {
+			fprintf(stderr, "ss: cannot get fd for BPF map ID %u%s\n",
+				id, errno == EPERM ?
+					": missing root permissions, CAP_BPF, or CAP_SYS_ADMIN" : "");
+			goto err;
+		}
+
+		r = bpf_obj_get_info_by_fd(fd, &info, &len);
+		if (r) {
+			fprintf(stderr, "ss: failed to get info for BPF map ID %u\n",
+				id);
+			close(fd);
+			goto err;
+		}
+
+		if (info.type != BPF_MAP_TYPE_SK_STORAGE) {
+			close(fd);
+			continue;
+		}
+
+		if (bpf_map_opts.nr_maps == MAX_NR_BPF_MAP_ID_OPTS) {
+			fprintf(stderr, "ss: too many (> %u) BPF socket-local storage maps found, skipping map ID %u\n",
+				MAX_NR_BPF_MAP_ID_OPTS, id);
+			close(fd);
+			continue;
+		}
+
+		bpf_map_opts.maps[bpf_map_opts.nr_maps].id = id;
+		bpf_map_opts.maps[bpf_map_opts.nr_maps++].fd = fd;
+	}
+
+	bpf_map_opts.show_all = true;
+
+	return 0;
+
+err:
+	for (i = 0; i < bpf_map_opts.nr_maps; ++i)
+		close(bpf_map_opts.maps[i].fd);
+
+	return -1;
+}
+
+static int bpf_map_opts_add_id(const char *optarg)
+{
+	struct bpf_map_info info = {};
+	uint32_t len = sizeof(info);
+	size_t optarg_len;
+	unsigned long id;
+	unsigned int i;
+	char *end;
+	int fd;
+	int r;
+
+	if (bpf_map_opts.show_all) {
+		bpf_map_opts_mixed_error();
+		return -1;
+	}
+
+	optarg_len = strlen(optarg);
+	id = strtoul(optarg, &end, 0);
+	if (end != optarg + optarg_len || id == 0 || id > UINT32_MAX) {
+		fprintf(stderr, "ss: invalid BPF map ID %s\n", optarg);
+		return -1;
+	}
+
+	for (i = 0; i < bpf_map_opts.nr_maps; i++) {
+		if (bpf_map_opts.maps[i].id == id)
+			return 0;
+	}
+
+	if (bpf_map_opts.nr_maps == MAX_NR_BPF_MAP_ID_OPTS) {
+		fprintf(stderr, "ss: too many (> %u) BPF socket-local storage maps found, skipping map ID %lu\n",
+			MAX_NR_BPF_MAP_ID_OPTS, id);
+		return 0;
+	}
+
+	fd = bpf_map_get_fd_by_id(id);
+	if (fd == -1) {
+		fprintf(stderr, "ss: cannot get fd for BPF map ID %lu%s\n",
+			id, errno == EPERM ?
+				": missing root permissions, CAP_BPF, or CAP_SYS_ADMIN" : "");
+		return -1;
+	}
+
+	r = bpf_obj_get_info_by_fd(fd, &info, &len);
+	if (r) {
+		fprintf(stderr, "ss: failed to get info for BPF map ID %lu\n", id);
+		close(fd);
+		return -1;
+	}
+
+	if (info.type != BPF_MAP_TYPE_SK_STORAGE) {
+		fprintf(stderr, "ss: BPF map with ID %s has type '%s', expecting 'sk_storage'\n",
+			optarg, libbpf_bpf_map_type_str(info.type));
+		close(fd);
+		return -1;
+	}
+
+	bpf_map_opts.maps[bpf_map_opts.nr_maps].id = id;
+	bpf_map_opts.maps[bpf_map_opts.nr_maps++].fd = fd;
+
+	return 0;
+}
+
+static inline bool bpf_map_opts_is_enabled(void)
+{
+	return bpf_map_opts.nr_maps;
+}
+
+static struct rtattr *bpf_map_opts_alloc_rta(void)
+{
+	size_t total_size = RTA_LENGTH(RTA_LENGTH(sizeof(int)) * bpf_map_opts.nr_maps);
+	struct rtattr *stgs_rta, *fd_rta;
+	unsigned int i;
+	void *buf;
+
+	buf = malloc(total_size);
+	if (!buf)
+		return NULL;
+
+	stgs_rta = buf;
+	stgs_rta->rta_type = INET_DIAG_REQ_SK_BPF_STORAGES | NLA_F_NESTED;
+	stgs_rta->rta_len = total_size;
+
+	buf = RTA_DATA(stgs_rta);
+	for (i = 0; i < bpf_map_opts.nr_maps; i++) {
+		int *fd;
+
+		fd_rta = buf;
+		fd_rta->rta_type = SK_DIAG_BPF_STORAGE_REQ_MAP_FD;
+		fd_rta->rta_len = RTA_LENGTH(sizeof(int));
+
+		fd = RTA_DATA(fd_rta);
+		*fd = bpf_map_opts.maps[i].fd;
+
+		buf += fd_rta->rta_len;
+	}
+
+	return stgs_rta;
+}
+
+static void show_sk_bpf_storages(struct rtattr *bpf_stgs)
+{
+	struct rtattr *tb[SK_DIAG_BPF_STORAGE_MAX + 1], *bpf_stg;
+	unsigned int rem;
+
+	for (bpf_stg = RTA_DATA(bpf_stgs), rem = RTA_PAYLOAD(bpf_stgs);
+	     RTA_OK(bpf_stg, rem); bpf_stg = RTA_NEXT(bpf_stg, rem)) {
+
+		if ((bpf_stg->rta_type & NLA_TYPE_MASK) != SK_DIAG_BPF_STORAGE)
+			continue;
+
+		parse_rtattr_nested(tb, SK_DIAG_BPF_STORAGE_MAX,
+				    (struct rtattr *)bpf_stg);
+
+		if (tb[SK_DIAG_BPF_STORAGE_MAP_ID]) {
+			out("map_id:%u",
+			    rta_getattr_u32(tb[SK_DIAG_BPF_STORAGE_MAP_ID]));
+		}
+	}
+}
+
+#endif
+
 static int inet_show_sock(struct nlmsghdr *nlh,
 			  struct sockstat *s)
 {
@@ -3375,8 +3598,8 @@ static int inet_show_sock(struct nlmsghdr *nlh,
 	struct inet_diag_msg *r = NLMSG_DATA(nlh);
 	unsigned char v6only = 0;
 
-	parse_rtattr(tb, INET_DIAG_MAX, (struct rtattr *)(r+1),
-		     nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*r)));
+	parse_rtattr_flags(tb, INET_DIAG_MAX, (struct rtattr *)(r+1),
+			   nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*r)), NLA_F_NESTED);
 
 	if (tb[INET_DIAG_PROTOCOL])
 		s->type = rta_getattr_u8(tb[INET_DIAG_PROTOCOL]);
@@ -3473,6 +3696,11 @@ static int inet_show_sock(struct nlmsghdr *nlh,
 	}
 	sctp_ino = s->ino;
 
+	if (tb[INET_DIAG_SK_BPF_STORAGES]) {
+		field_set(COL_SKSTOR);
+		show_sk_bpf_storages(tb[INET_DIAG_SK_BPF_STORAGES]);
+	}
+
 	return 0;
 }
 
@@ -3554,13 +3782,14 @@ static int sockdiag_send(int family, int fd, int protocol, struct filter *f)
 {
 	struct sockaddr_nl nladdr = { .nl_family = AF_NETLINK };
 	DIAG_REQUEST(req, struct inet_diag_req_v2 r);
+	struct rtattr *bpf_stgs_rta = NULL;
 	char *bc = NULL;
 	int bclen;
 	__u32 proto;
 	struct msghdr msg;
 	struct rtattr rta_bc;
 	struct rtattr rta_proto;
-	struct iovec iov[5];
+	struct iovec iov[6];
 	int iovlen = 1;
 
 	if (family == PF_UNSPEC)
@@ -3613,6 +3842,17 @@ static int sockdiag_send(int family, int fd, int protocol, struct filter *f)
 		iovlen += 2;
 	}
 
+	if (bpf_map_opts_is_enabled()) {
+		bpf_stgs_rta = bpf_map_opts_alloc_rta();
+		if (!bpf_stgs_rta) {
+			fprintf(stderr, "ss: cannot alloc request for --bpf-map\n");
+			return -1;
+		}
+
+		iov[iovlen++] = (struct iovec){ bpf_stgs_rta, bpf_stgs_rta->rta_len };
+		req.nlh.nlmsg_len += bpf_stgs_rta->rta_len;
+	}
+
 	msg = (struct msghdr) {
 		.msg_name = (void *)&nladdr,
 		.msg_namelen = sizeof(nladdr),
@@ -3621,10 +3861,13 @@ static int sockdiag_send(int family, int fd, int protocol, struct filter *f)
 	};
 
 	if (sendmsg(fd, &msg, 0) < 0) {
+		free(bpf_stgs_rta);
 		close(fd);
 		return -1;
 	}
 
+	free(bpf_stgs_rta);
+
 	return 0;
 }
@@ -5344,6 +5587,10 @@ static void _usage(FILE *dest)
 "       --tos           show tos and priority information\n"
 "       --cgroup        show cgroup information\n"
 "   -b, --bpf           show bpf filter socket information\n"
+#ifdef HAVE_LIBBPF
+"       --bpf-maps      show all BPF socket-local storage maps\n"
+"       --bpf-map-id=MAP-ID  show a BPF socket-local storage map\n"
+#endif
 "   -E, --events        continually display sockets as they are destroyed\n"
 "   -Z, --context       display task SELinux security contexts\n"
 "   -z, --contexts      display task and socket SELinux security contexts\n"
@@ -5460,6 +5707,9 @@ static int scan_state(const char *state)
 
 #define OPT_INET_SOCKOPT 262
 
+#define OPT_BPF_MAPS 263
+#define OPT_BPF_MAP_ID 264
+
 static const struct option long_opts[] = {
 	{ "numeric", 0, 0, 'n' },
 	{ "resolve", 0, 0, 'r' },
@@ -5504,6 +5754,10 @@ static const struct option long_opts[] = {
 	{ "mptcp", 0, 0, 'M' },
 	{ "oneline", 0, 0, 'O' },
 	{ "inet-sockopt", 0, 0, OPT_INET_SOCKOPT },
+#ifdef HAVE_LIBBPF
+	{ "bpf-maps", 0, 0, OPT_BPF_MAPS },
+	{ "bpf-map-id", 1, 0, OPT_BPF_MAP_ID },
+#endif
 	{ 0 }
 };
 
@@ -5706,6 +5960,16 @@ int main(int argc, char *argv[])
 		case OPT_INET_SOCKOPT:
 			show_inet_sockopt = 1;
 			break;
+#ifdef HAVE_LIBBPF
+		case OPT_BPF_MAPS:
+			if (bpf_map_opts_add_all())
+				exit(1);
+			break;
+		case OPT_BPF_MAP_ID:
+			if (bpf_map_opts_add_id(optarg))
+				exit(1);
+			break;
+#endif
 		case 'h':
 			help();
 		case '?':
@@ -5804,6 +6068,9 @@ int main(int argc, char *argv[])
 	if (!(current_filter.states & (current_filter.states - 1)))
 		columns[COL_STATE].disabled = 1;
 
+	if (bpf_map_opts.nr_maps)
+		columns[COL_SKSTOR].disabled = 0;
+
 	if (show_header)
 		print_header();

From patchwork Tue Nov 28 02:30:58 2023
X-Patchwork-Submitter: Quentin Deslandes
X-Patchwork-Id: 13470563
X-Patchwork-Delegate: dsahern@gmail.com
From: Quentin Deslandes
To:
CC: David Ahern, Martin KaFai Lau, Quentin Deslandes
Subject: [PATCH 3/3] ss: pretty-print BPF socket-local storage
Date: Mon, 27 Nov 2023 18:30:58 -0800
Message-ID: <20231128023058.53546-4-qde@naccy.de>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20231128023058.53546-1-qde@naccy.de>
References: <20231128023058.53546-1-qde@naccy.de>
X-Mailing-List: netdev@vger.kernel.org

ss is able to print the map ID(s) for which a given socket has BPF
socket-local storage defined (using --bpf-maps or --bpf-map-id=).
However, the actual content of the map remains hidden.

This change pretty-prints the socket-local storage content following the
socket details, similar to what `bpftool map dump` would do. The exact
output format is inspired by drgn, while the BTF data processing is
similar to bpftool's.

ss will print the map's content in a best-effort fashion: BTF types that
can be printed will be displayed, while types that are not yet supported
(e.g. BTF_KIND_VAR) will be replaced by a placeholder. For readability
reasons, the --oneline option is not compatible with this change.

The new out_prefix_t type is introduced to ease the printing of compound
types (e.g. structs, unions); it defines the prefix to print before the
actual value to ensure the output is properly indented.

COL_SKSTOR's header is replaced with an empty string, as it doesn't need
to be printed anymore; it's used as a "virtual" column to refer to the
socket-local storage dump, which will be printed under the socket
information.
The column's width is fixed to 1, so it doesn't mess up ss' output.

ss' output remains unchanged unless --bpf-maps or --bpf-map-id= is used,
in which case each socket containing BPF socket-local storage will be
followed by the content of the storage before the next socket's info is
displayed.

Signed-off-by: Quentin Deslandes
---
 misc/ss.c | 558 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 551 insertions(+), 7 deletions(-)

diff --git a/misc/ss.c b/misc/ss.c
index 5b255ce3..545e5475 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -51,8 +51,13 @@
 #include
 #include
 
+#ifdef HAVE_LIBBPF
+#include
+#endif
+
 #ifdef HAVE_LIBBPF
 #include
+#include
 #include
 #endif
 
@@ -136,7 +141,7 @@ static struct column columns[] = {
 	{ ALIGN_RIGHT, "Peer Address:", " ", 0, 0, 0 },
 	{ ALIGN_LEFT,  "Port", "", 0, 0, 0 },
 	{ ALIGN_LEFT,  "Process", "", 0, 0, 0 },
-	{ ALIGN_LEFT,  "Socket storage", "", 1, 0, 0 },
+	{ ALIGN_LEFT,  "", "", 1, 0, 0 },
 	{ ALIGN_LEFT,  "", "", 0, 0, 0 },
 };
 
@@ -1212,6 +1217,9 @@ static void render_calc_width(void)
 		 */
 		c->width = min(c->width, screen_width);
 
+		if (c == &columns[COL_SKSTOR])
+			c->width = 1;
+
 		if (c->width)
 			first = 0;
 	}
@@ -3386,6 +3394,8 @@ static struct bpf_map_opts {
 	struct bpf_sk_storage_map_info {
 		unsigned int id;
 		int fd;
+		struct bpf_map_info info;
+		struct btf *btf;
 	} maps[MAX_NR_BPF_MAP_ID_OPTS];
 	bool show_all;
 	struct btf *kernel_btf;
@@ -3397,6 +3407,32 @@ static void bpf_map_opts_mixed_error(void)
 		"ss: --bpf-maps and --bpf-map-id cannot be used together\n");
 }
 
+static int bpf_maps_opts_load_btf(struct bpf_map_info *info, struct btf **btf)
+{
+	if (info->btf_vmlinux_value_type_id) {
+		if (!bpf_map_opts.kernel_btf) {
+			bpf_map_opts.kernel_btf = libbpf_find_kernel_btf();
+			if (!bpf_map_opts.kernel_btf) {
+				fprintf(stderr, "ss: failed to load kernel BTF\n");
+				return -1;
+			}
+		}
+
+		*btf = bpf_map_opts.kernel_btf;
+	} else if (info->btf_value_type_id) {
+		*btf = btf__load_from_kernel_by_id(info->btf_id);
+		if (!*btf) {
+			fprintf(stderr, "ss: failed to load BTF for map ID %u\n",
+				info->id);
+			return -1;
+		}
+	} else {
+		*btf = NULL;
+	}
+
+	return 0;
+}
+
 static int bpf_map_opts_add_all(void)
 {
 	unsigned int i;
@@ -3412,6 +3448,7 @@ static int bpf_map_opts_add_all(void)
 	while (1) {
 		struct bpf_map_info info = {};
 		uint32_t len = sizeof(info);
+		struct btf *btf;
 
 		r = bpf_map_get_next_id(id, &id);
 		if (r) {
@@ -3450,8 +3487,18 @@ static int bpf_map_opts_add_all(void)
 			continue;
 		}
 
+		r = bpf_maps_opts_load_btf(&info, &btf);
+		if (r) {
+			fprintf(stderr, "ss: failed to get BTF data for BPF map ID: %u\n",
+				id);
+			close(fd);
+			goto err;
+		}
+
 		bpf_map_opts.maps[bpf_map_opts.nr_maps].id = id;
-		bpf_map_opts.maps[bpf_map_opts.nr_maps++].fd = fd;
+		bpf_map_opts.maps[bpf_map_opts.nr_maps].fd = fd;
+		bpf_map_opts.maps[bpf_map_opts.nr_maps].info = info;
+		bpf_map_opts.maps[bpf_map_opts.nr_maps++].btf = btf;
 	}
 
 	bpf_map_opts.show_all = true;
@@ -3470,6 +3517,7 @@ static int bpf_map_opts_add_id(const char *optarg)
 	struct bpf_map_info info = {};
 	uint32_t len = sizeof(info);
 	size_t optarg_len;
+	struct btf *btf;
 	unsigned long id;
 	unsigned int i;
 	char *end;
@@ -3521,12 +3569,34 @@ static int bpf_map_opts_add_id(const char *optarg)
 		return -1;
 	}
 
+	r = bpf_maps_opts_load_btf(&info, &btf);
+	if (r) {
+		fprintf(stderr, "ss: failed to get BTF data for BPF map ID: %lu\n",
+			id);
+		return -1;
+	}
+
 	bpf_map_opts.maps[bpf_map_opts.nr_maps].id = id;
-	bpf_map_opts.maps[bpf_map_opts.nr_maps++].fd = fd;
+	bpf_map_opts.maps[bpf_map_opts.nr_maps].fd = fd;
+	bpf_map_opts.maps[bpf_map_opts.nr_maps].info = info;
+	bpf_map_opts.maps[bpf_map_opts.nr_maps++].btf = btf;
 
 	return 0;
 }
 
+static const struct bpf_sk_storage_map_info *bpf_map_opts_get_info(
+	unsigned int map_id)
+{
+	unsigned int i;
+
+	for (i = 0; i < bpf_map_opts.nr_maps; ++i) {
+		if (bpf_map_opts.maps[i].id == map_id)
+			return &bpf_map_opts.maps[i];
+	}
+
+	return NULL;
+}
+
 static inline bool bpf_map_opts_is_enabled(void)
 {
 	return bpf_map_opts.nr_maps;
@@ -3568,10 +3638,472 @@ static struct rtattr *bpf_map_opts_alloc_rta(void)
 	return stgs_rta;
 }
 
+#define OUT_PREFIX_LEN 65
+
+/* Print a prefixed formatted string. Used to dump BPF socket-local storage
+ * nested structures properly.
+ */
+#define OUT_P(p, fmt, ...) out("%s" fmt, *(p), ##__VA_ARGS__)
+
+typedef char(out_prefix_t)[OUT_PREFIX_LEN];
+
+static void out_prefix_push(out_prefix_t *prefix)
+{
+	size_t len = strlen(*prefix);
+
+	if (len + 5 > OUT_PREFIX_LEN)
+		return;
+
+	strncpy(&(*prefix)[len], "    ", 5);
+}
+
+static void out_prefix_pop(out_prefix_t *prefix)
+{
+	size_t len = strlen(*prefix);
+
+	if (len < 4)
+		return;
+
+	(*prefix)[len - 4] = '\0';
+}
+
+static inline const char *btf_typename_or_fallback(const struct btf *btf,
+						   unsigned int name_off)
+{
+	static const char *fallback = "";
+	static const char *anon = "";
+	const char *typename;
+
+	typename = btf__name_by_offset(btf, name_off);
+	if (!typename)
+		return fallback;
+
+	if (strcmp(typename, "") == 0)
+		return anon;
+
+	return typename;
+}
+
+static void out_btf_int128(const struct btf *btf, const struct btf_type *type,
+			   const void *data, out_prefix_t *prefix)
+{
+	uint64_t high, low;
+	const char *typename;
+
+#ifdef __BIG_ENDIAN_BITFIELD
+	high = *(uint64_t *)data;
+	low = *(uint64_t *)(data + 8);
+#else
+	high = *(uint64_t *)(data + 8);
+	low = *(uint64_t *)data;
+#endif
+
+	typename = btf_typename_or_fallback(btf, type->name_off);
+
+	if (high == 0)
+		OUT_P(prefix, "(%s)0x%lx,\n", typename, low);
+	else
+		OUT_P(prefix, "(%s)0x%lx%016lx,\n", typename, high, low);
+}
+
+#define BITS_PER_BYTE_MASKED(bits) ((bits) & 7)
+#define BITS_ROUNDDOWN_BYTES(bits) ((bits) >> 3)
+#define BITS_ROUNDUP_BYTES(bits) \
+	(BITS_ROUNDDOWN_BYTES(bits) + !!BITS_PER_BYTE_MASKED(bits))
+
+static void out_btf_bitfield(const struct btf *btf, const struct btf_type *type,
+			     uint32_t bitfield_offset, uint8_t bitfield_size,
+			     const void *data, out_prefix_t *prefix)
+{
+	int left_shift_bits, right_shift_bits;
+	uint64_t high, low;
+	uint64_t print_num[2] = {};
+	int bits_to_copy;
+	const char *typename;
+
+	bits_to_copy = bitfield_offset + bitfield_size;
+	memcpy(print_num, data, BITS_ROUNDUP_BYTES(bits_to_copy));
+
+	right_shift_bits = 128 - bitfield_size;
+#if defined(__BIG_ENDIAN_BITFIELD)
+	high = print_num[0];
+	low = print_num[1];
+	left_shift_bits = bitfield_offset;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+	high = print_num[1];
+	low = print_num[0];
+	left_shift_bits = 128 - bits_to_copy;
+#else
+#error neither big nor little endian
+#endif
+
+	/* shake out un-needed bits by shift/or operations */
+	if (left_shift_bits >= 64) {
+		high = low << (left_shift_bits - 64);
+		low = 0;
+	} else {
+		high = (high << left_shift_bits) |
+		       (low >> (64 - left_shift_bits));
+		low = low << left_shift_bits;
+	}
+
+	if (right_shift_bits >= 64) {
+		low = high >> (right_shift_bits - 64);
+		high = 0;
+	} else {
+		low = (low >> right_shift_bits) |
+		      (high << (64 - right_shift_bits));
+		high = high >> right_shift_bits;
+	}
+
+	typename = btf_typename_or_fallback(btf, type->name_off);
+
+	if (high == 0) {
+		OUT_P(prefix, "(%s:%d)0x%lx,\n", typename, bitfield_size, low);
+	} else {
+		OUT_P(prefix, "(%s:%d)0x%lx%016lx,\n", typename, bitfield_size,
+		      high, low);
+	}
+}
+
+static void out_btf_int(const struct btf *btf, const struct btf_type *type,
+			uint32_t bit_offset, const void *data,
+			out_prefix_t *prefix)
+{
+	uint32_t *int_type = (uint32_t *)(type + 1);
+	uint32_t nbits = BTF_INT_BITS(*int_type);
+	const char *typename;
+
+	typename = btf_typename_or_fallback(btf, type->name_off);
+
+	if (bit_offset || BTF_INT_OFFSET(*int_type) ||
+	    BITS_PER_BYTE_MASKED(nbits)) {
+		out_btf_bitfield(btf, type, BTF_INT_OFFSET(*int_type), nbits,
+				 data, prefix);
+		return;
+	}
+
+	if (nbits == 128) {
+		out_btf_int128(btf, type, data, prefix);
+		return;
+	}
+
+	switch (BTF_INT_ENCODING(*int_type)) {
+	case 0:
+		if (BTF_INT_BITS(*int_type) == 64)
+			OUT_P(prefix, "(%s)%lu,\n", typename, *(uint64_t *)data);
+		else if (BTF_INT_BITS(*int_type) == 32)
+			OUT_P(prefix, "(%s)%u,\n", typename, *(uint32_t *)data);
+		else if (BTF_INT_BITS(*int_type) == 16)
+			OUT_P(prefix, "(%s)%hu,\n", typename, *(uint16_t *)data);
+		else if (BTF_INT_BITS(*int_type) == 8)
+			OUT_P(prefix, "(%s)%hhu,\n", typename, *(uint8_t *)data);
+		else
+			OUT_P(prefix, ",");
+		break;
+	case BTF_INT_SIGNED:
+		if (BTF_INT_BITS(*int_type) == 64)
+			OUT_P(prefix, "(%s)%ld,\n", typename, *(int64_t *)data);
+		else if (BTF_INT_BITS(*int_type) == 32)
+			OUT_P(prefix, "(%s)%d,\n", typename, *(int32_t *)data);
+		else if (BTF_INT_BITS(*int_type) == 16)
+			OUT_P(prefix, "(%s)%hd,\n", typename, *(int16_t *)data);
+		else if (BTF_INT_BITS(*int_type) == 8)
+			OUT_P(prefix, "(%s)%hhd,\n", typename, *(int8_t *)data);
+		else
+			OUT_P(prefix, ",");
+		break;
+	case BTF_INT_CHAR:
+		OUT_P(prefix, "(%s)0x%hhx,\n", typename, *(char *)data);
+		break;
+	case BTF_INT_BOOL:
+		OUT_P(prefix, "(%s)%s,\n", typename,
+		      *(bool *)data ? "true" : "false");
+		break;
+	default:
+		OUT_P(prefix, ",\n");
+		break;
+	}
+}
+
+static void out_btf_ptr(const struct btf *btf, const struct btf_type *type,
+			const void *data, out_prefix_t *prefix)
+{
+	unsigned long value = *(unsigned long *)data;
+	int actual_type_id;
+	const struct btf_type *actual_type;
+	const char *typename = NULL;
+
+	actual_type_id = btf__resolve_type(btf, type->type);
+	if (actual_type_id > 0) {
+		actual_type = btf__type_by_id(btf, actual_type_id);
+		if (actual_type)
+			typename = btf__name_by_offset(btf, actual_type->name_off);
+	}
+
+	typename = typename ?: "void";
+
+	OUT_P(prefix, "(%s *)%p,\n", typename, (void *)value);
+}
+
+static void out_btf_dump_type(const struct btf *btf, int bit_offset,
+			      uint32_t type_id, const void *data, size_t len,
+			      out_prefix_t *prefix);
+
+static void out_btf_array(const struct btf *btf, const struct btf_type *type,
+			  const void *data, out_prefix_t *prefix)
+{
+	const struct btf_array *array = (struct btf_array *)(type + 1);
+	const struct btf_type *elem_type;
+	long long elem_size;
+
+	elem_type = btf__type_by_id(btf, array->type);
+	if (!elem_type) {
+		OUT_P(prefix, ",\n", array->type);
+		return;
+	}
+
+	elem_size = btf__resolve_size(btf, array->type);
+	if (elem_size < 0) {
+		OUT_P(prefix, ",\n", array->type);
+		return;
+	}
+
+	for (int i = 0; i < array->nelems; ++i) {
+		out_btf_dump_type(btf, 0, array->type, data + i * elem_size,
+				  elem_size, prefix);
+	}
+}
+
+static void out_btf_struct(const struct btf *btf, const struct btf_type *type,
+			   const void *data, out_prefix_t *prefix)
+{
+	struct btf_member *member = (struct btf_member *)(type + 1);
+	const struct btf_type *member_type;
+	const void *member_data;
+	out_prefix_t prefix_override = {};
+	unsigned int i;
+
+	for (i = 0; i < BTF_INFO_VLEN(type->info); i++) {
+		uint32_t bitfield_offset = member[i].offset;
+		uint32_t bitfield_size = 0;
+
+		if (BTF_INFO_KFLAG(type->info)) {
+			/* If btf_type.info.kind_flag is set, then
+			 * btf_member.offset is composed of:
+			 * bitfield_offset << 24 | bitfield_size
+			 */
+			bitfield_size = BTF_MEMBER_BITFIELD_SIZE(bitfield_offset);
+			bitfield_offset = BTF_MEMBER_BIT_OFFSET(bitfield_offset);
+		}
+
+		OUT_P(prefix, ".%s = ",
+		      btf_typename_or_fallback(btf, member[i].name_off));
+
+		/* The prefix has to be overwritten as this function prints the
+		 * field's name, so we don't print the prefix once here before
+		 * the name, then again in out_btf_bitfield() or out_btf_int()
+		 * before printing the actual value on the same line.
+		 */
+
+		member_type = btf__type_by_id(btf, member[i].type);
+		if (!member_type) {
+			OUT_P(&prefix_override, ",\n",
+			      member[i].type);
+			return;
+		}
+
+		member_data = data + BITS_ROUNDDOWN_BYTES(bitfield_offset);
+		bitfield_offset = BITS_PER_BYTE_MASKED(bitfield_offset);
+
+		if (bitfield_size) {
+			out_btf_bitfield(btf, member_type, bitfield_offset,
+					 bitfield_size, member_data,
+					 &prefix_override);
+		} else {
+			out_btf_dump_type(btf, bitfield_offset, member[i].type,
+					  member_data, 0, &prefix_override);
+		}
+	}
+}
+
+static void out_btf_enum(const struct btf *btf, const struct btf_type *type,
+			 const void *data, out_prefix_t *prefix)
+{
+	const struct btf_enum *enums = (struct btf_enum *)(type + 1);
+	int64_t value;
+	unsigned int i;
+
+	switch (type->size) {
+	case 8:
+		value = *(int64_t *)data;
+		break;
+	case 4:
+		value = *(int32_t *)data;
+		break;
+	case 2:
+		value = *(int16_t *)data;
+		break;
+	case 1:
+		value = *(int8_t *)data;
+		break;
+	default:
+		OUT_P(prefix, ",\n", type->size);
+		return;
+	}
+
+	for (i = 0; i < BTF_INFO_VLEN(type->info); ++i) {
+		if (value == enums[i].val) {
+			OUT_P(prefix, "(enum %s)%s\n",
+			      btf_typename_or_fallback(btf, type->name_off),
+			      btf_typename_or_fallback(btf, enums[i].name_off));
+			return;
+		}
+	}
+}
+
+static void out_btf_enum64(const struct btf *btf, const struct btf_type *type,
+			   const void *data, out_prefix_t *prefix)
+{
+	const struct btf_enum64 *enums = (struct btf_enum64 *)(type + 1);
+	uint32_t lo32, hi32;
+	uint64_t value;
+	unsigned int i;
+
+	value = *(uint64_t *)data;
+	lo32 = (uint32_t)value;
+	hi32 = value >> 32;
+
+	for (i = 0; i < BTF_INFO_VLEN(type->info); i++) {
+		if (lo32 == enums[i].val_lo32 && hi32 == enums[i].val_hi32) {
+			OUT_P(prefix, "(enum %s)%s\n",
+			      btf_typename_or_fallback(btf, type->name_off),
+			      btf__name_by_offset(btf, enums[i].name_off));
+			return;
+		}
+	}
+}
+
+static out_prefix_t out_global_prefix = {};
+
+static void out_btf_dump_type(const struct btf *btf, int bit_offset,
+			      uint32_t type_id, const void *data, size_t len,
+			      out_prefix_t *prefix)
+{
+	const struct btf_type *type;
+	out_prefix_t *global_prefix = &out_global_prefix;
+
+	if (!btf) {
+		OUT_P(prefix, ",\n");
+		return;
+	}
+
+	type = btf__type_by_id(btf, type_id);
+	if (!type) {
+		OUT_P(prefix, ",\n", type_id);
+		return;
+	}
+
+	switch (BTF_INFO_KIND(type->info)) {
+	case BTF_KIND_UNION:
+	case BTF_KIND_STRUCT:
+		OUT_P(prefix, "(%s %s) {\n",
+		      BTF_INFO_KIND(type->info) == BTF_KIND_STRUCT ?
+			      "struct" : "union",
+		      btf_typename_or_fallback(btf, type->name_off));
+
+		out_prefix_push(global_prefix);
+		out_btf_struct(btf, type, data, global_prefix);
+		out_prefix_pop(global_prefix);
+		OUT_P(global_prefix, "},\n");
+		break;
+	case BTF_KIND_ARRAY:
+	{
+		struct btf_array *array = (struct btf_array *)(type + 1);
+		const struct btf_type *content_type =
+			btf__type_by_id(btf, array->type);
+
+		if (!content_type) {
+			OUT_P(prefix, ",\n", array->type);
+			return;
+		}
+
+		OUT_P(prefix, "(%s[]) {\n",
+		      btf_typename_or_fallback(btf, content_type->name_off));
+		out_prefix_push(global_prefix);
+		out_btf_array(btf, type, data, global_prefix);
+		out_prefix_pop(global_prefix);
+		OUT_P(global_prefix, "},\n");
+	}
+	break;
+	case BTF_KIND_TYPEDEF:
+	case BTF_KIND_VOLATILE:
+	case BTF_KIND_CONST:
+	case BTF_KIND_RESTRICT:
+	{
+		int actual_type_id = btf__resolve_type(btf, type_id);
+
+		if (actual_type_id < 0) {
+			OUT_P(prefix, ",\n", type_id);
+			return;
+		}
+
+		return out_btf_dump_type(btf, 0, actual_type_id, data,
+					 len, prefix);
+	}
+	break;
+	case BTF_KIND_INT:
+		out_btf_int(btf, type, bit_offset, data, prefix);
+		break;
+	case BTF_KIND_PTR:
+		out_btf_ptr(btf, type, data, prefix);
+		break;
+	case BTF_KIND_ENUM:
+		out_btf_enum(btf, type, data, prefix);
+		break;
+	case BTF_KIND_ENUM64:
+		out_btf_enum64(btf, type, data, prefix);
+		break;
+	case BTF_KIND_FWD:
+		OUT_P(prefix, ",\n");
+		break;
+	case BTF_KIND_UNKN:
+		OUT_P(prefix, ",\n");
+		break;
+	case BTF_KIND_VAR:
+	case BTF_KIND_DATASEC:
+	default:
+		OUT_P(prefix, ",\n",
+		      BTF_INFO_KIND(type->info));
+		break;
+	}
+}
+
+static void out_bpf_sk_storage(int map_id, const void *data, size_t len,
+			       out_prefix_t *prefix)
+{
+	uint32_t type_id;
+	struct bpf_sk_storage_map_info *map_info;
+
+	map_info = bpf_map_opts_get_info(map_id);
+	if (!map_info) {
+		OUT_P(prefix, "map_id: %d: missing map info", map_id);
+		return;
+	}
+
+	if (map_info->info.value_size != len) {
+		OUT_P(prefix, "map_id: %d: invalid value size, expecting %u, got %lu\n",
+		      map_id, map_info->info.value_size, len);
+		return;
+	}
+
+	type_id = map_info->info.btf_vmlinux_value_type_id ?:
+		  map_info->info.btf_value_type_id;
+
+	OUT_P(prefix, "map_id: %d [\n", map_id);
+	out_prefix_push(prefix);
+
+	out_btf_dump_type(map_info->btf, 0, type_id, data, len, prefix);
+
+	out_prefix_pop(prefix);
+	OUT_P(prefix, "]");
+}
+
 static void show_sk_bpf_storages(struct rtattr *bpf_stgs)
 {
-	struct rtattr *tb[SK_DIAG_BPF_STORAGE_MAX + 1], *bpf_stg;
-	unsigned int rem;
+	struct rtattr *tb[SK_DIAG_BPF_STORAGE_MAX + 1], *bpf_stg;
+	out_prefix_t *global_prefix = &out_global_prefix;
+	unsigned int rem, map_id;
+	struct rtattr *value;
 
 	for (bpf_stg = RTA_DATA(bpf_stgs), rem = RTA_PAYLOAD(bpf_stgs);
 	     RTA_OK(bpf_stg, rem); bpf_stg = RTA_NEXT(bpf_stg, rem)) {
@@ -3583,8 +4115,15 @@ static void show_sk_bpf_storages(struct rtattr *bpf_stgs)
 			(struct rtattr *)bpf_stg);
 
 		if (tb[SK_DIAG_BPF_STORAGE_MAP_ID]) {
-			out("map_id:%u",
-			    rta_getattr_u32(tb[SK_DIAG_BPF_STORAGE_MAP_ID]));
+			out("\n");
+
+			map_id = rta_getattr_u32(tb[SK_DIAG_BPF_STORAGE_MAP_ID]);
+			value = tb[SK_DIAG_BPF_STORAGE_MAP_VALUE];
+
+			out_prefix_push(global_prefix);
+			out_bpf_sk_storage(map_id, RTA_DATA(value),
+					   RTA_PAYLOAD(value), global_prefix);
+			out_prefix_pop(global_prefix);
 		}
 	}
 }
@@ -5978,6 +6517,11 @@ int main(int argc, char *argv[])
 		}
 	}
 
+	if (oneline && (bpf_map_opts.nr_maps || bpf_map_opts.show_all)) {
+		fprintf(stderr, "ss: --oneline, --bpf-maps, and --bpf-map-id are incompatible\n");
+		exit(-1);
+	}
+
 	if (show_processes || show_threads ||
show_proc_ctx || show_sock_ctx) user_ent_hash_build();