| Message ID | 20220420122307.5290-1-xiangxia.m.yue@gmail.com (mailing list archive) |
|---|---|
| State | Changes Requested |
| Delegated to | Netdev Maintainers |
| Series | [net-next,v1] bpf: add bpf_ktime_get_real_ns helper |
xiangxia.m.yue@gmail.com writes:

> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> This patch introduces a new bpf_ktime_get_real_ns helper, which may
> help us measure skb latency in the ingress/forwarding path:
>
> HW/SW[1] -> ip_rcv/tcp_rcv_established -> tcp_recvmsg_locked/tcp_update_recv_tstamps
>
> * Insert a BPF kprobe into ip_rcv/tcp_rcv_established invoking this helper.
>   Then we can inspect how much time has elapsed since the HW/SW timestamp.
> * If inserting a BPF kprobe into tcp_update_recv_tstamps, invoked by
>   tcp_recvmsg, we can measure how long an skb waits in the tcp receive
>   queue. The cause of this latency can be the application fetching the
>   TCP messages too late.

Why not just use one of the existing ktime helpers and also add a BPF
probe to set the initial timestamp instead of relying on skb->tstamp?

-Toke
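[Editor's note: a minimal sketch of the approach Toke suggests, using only the
existing bpf_ktime_get_ns() helper (CLOCK_MONOTONIC) and a hash map keyed by
the skb pointer to record the entry timestamp. The map name, sizes, and the
bpf_printk reporting are illustrative assumptions, not code from the thread.]

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 10240);
	__type(key, u64);	/* skb pointer */
	__type(value, u64);	/* entry timestamp, ns */
} entry_ts SEC(".maps");

SEC("kprobe/ip_rcv")
int BPF_KPROBE(on_ip_rcv, struct sk_buff *skb)
{
	u64 key = (u64)skb;
	u64 ts = bpf_ktime_get_ns();	/* existing helper, CLOCK_MONOTONIC */

	/* Record when this skb entered the IP layer. */
	bpf_map_update_elem(&entry_ts, &key, &ts, BPF_ANY);
	return 0;
}

SEC("kprobe/tcp_rcv_established")
int BPF_KPROBE(on_tcp_rcv, struct sock *sk, struct sk_buff *skb)
{
	u64 key = (u64)skb;
	u64 *ts = bpf_map_lookup_elem(&entry_ts, &key);

	if (ts) {
		/* Elapsed time between ip_rcv and tcp_rcv_established. */
		u64 delta = bpf_ktime_get_ns() - *ts;

		bpf_printk("rx latency: %llu ns", delta);
		bpf_map_delete_elem(&entry_ts, &key);
	}
	return 0;
}

char LICENSE[] SEC("license") = "GPL";

Since both probes sample the same monotonic clock, no wall-clock helper is
needed for the in-stack segment; only comparison against skb->tstamp would
require real time.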
On Wed, Apr 20, 2022 at 5:53 AM Toke Høiland-Jørgensen <toke@kernel.org> wrote:
>
> xiangxia.m.yue@gmail.com writes:
>
> > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >
> > This patch introduces a new bpf_ktime_get_real_ns helper, which may
> > help us measure skb latency in the ingress/forwarding path:
> >
> > HW/SW[1] -> ip_rcv/tcp_rcv_established -> tcp_recvmsg_locked/tcp_update_recv_tstamps
> >
> > * Insert a BPF kprobe into ip_rcv/tcp_rcv_established invoking this helper.
> >   Then we can inspect how much time has elapsed since the HW/SW timestamp.
> > * If inserting a BPF kprobe into tcp_update_recv_tstamps, invoked by
> >   tcp_recvmsg, we can measure how long an skb waits in the tcp receive
> >   queue. The cause of this latency can be the application fetching the
> >   TCP messages too late.
>
> Why not just use one of the existing ktime helpers and also add a BPF
> probe to set the initial timestamp instead of relying on skb->tstamp?

You don't even need a BPF probe for this. See [0] for how retsnoop
converts bpf_ktime_get_ns() into real time.

[0] https://github.com/anakryiko/retsnoop/blob/master/src/retsnoop.c#L649-L668

> -Toke
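[Editor's note: the technique the retsnoop link shows boils down to sampling
CLOCK_REALTIME and CLOCK_MONOTONIC back to back in userspace and applying the
difference to BPF-side bpf_ktime_get_ns() values. A standalone sketch of the
idea, not retsnoop's exact code:]

#include <stdint.h>
#include <time.h>

static uint64_t timespec_ns(const struct timespec *ts)
{
	return (uint64_t)ts->tv_sec * 1000000000ULL + (uint64_t)ts->tv_nsec;
}

/*
 * Estimate the CLOCK_REALTIME - CLOCK_MONOTONIC offset. Adding this to a
 * bpf_ktime_get_ns() timestamp (CLOCK_MONOTONIC) yields wall-clock time.
 */
static int64_t realtime_offset_ns(void)
{
	struct timespec real, mono;

	clock_gettime(CLOCK_REALTIME, &real);
	clock_gettime(CLOCK_MONOTONIC, &mono);
	return (int64_t)(timespec_ns(&real) - timespec_ns(&mono));
}

The two reads are not atomic, so a small error remains; and, as Tonghao notes
later in the thread, the offset also goes stale if the wall clock is stepped.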
Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:

> On Wed, Apr 20, 2022 at 5:53 AM Toke Høiland-Jørgensen <toke@kernel.org> wrote:
>>
>> xiangxia.m.yue@gmail.com writes:
>>
>> > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>> >
>> > This patch introduces a new bpf_ktime_get_real_ns helper, which may
>> > help us measure skb latency in the ingress/forwarding path:
>> >
>> > HW/SW[1] -> ip_rcv/tcp_rcv_established -> tcp_recvmsg_locked/tcp_update_recv_tstamps
>> >
>> > * Insert a BPF kprobe into ip_rcv/tcp_rcv_established invoking this helper.
>> >   Then we can inspect how much time has elapsed since the HW/SW timestamp.
>> > * If inserting a BPF kprobe into tcp_update_recv_tstamps, invoked by
>> >   tcp_recvmsg, we can measure how long an skb waits in the tcp receive
>> >   queue. The cause of this latency can be the application fetching the
>> >   TCP messages too late.
>>
>> Why not just use one of the existing ktime helpers and also add a BPF
>> probe to set the initial timestamp instead of relying on skb->tstamp?
>
> You don't even need a BPF probe for this. See [0] for how retsnoop
> converts bpf_ktime_get_ns() into real time.
>
> [0] https://github.com/anakryiko/retsnoop/blob/master/src/retsnoop.c#L649-L668

Uh, neat! Thanks for the link :)

-Toke
On Wed, Apr 20, 2022 at 8:53 PM Toke Høiland-Jørgensen <toke@kernel.org> wrote:
>
> xiangxia.m.yue@gmail.com writes:
>
> > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >
> > This patch introduces a new bpf_ktime_get_real_ns helper, which may
> > help us measure skb latency in the ingress/forwarding path:
> >
> > HW/SW[1] -> ip_rcv/tcp_rcv_established -> tcp_recvmsg_locked/tcp_update_recv_tstamps
> >
> > * Insert a BPF kprobe into ip_rcv/tcp_rcv_established invoking this helper.
> >   Then we can inspect how much time has elapsed since the HW/SW timestamp.
> > * If inserting a BPF kprobe into tcp_update_recv_tstamps, invoked by
> >   tcp_recvmsg, we can measure how long an skb waits in the tcp receive
> >   queue. The cause of this latency can be the application fetching the
> >   TCP messages too late.
>
> Why not just use one of the existing ktime helpers and also add a BPF
> probe to set the initial timestamp instead of relying on skb->tstamp?

Yes, that also looks good to me.

> -Toke
On Thu, Apr 21, 2022 at 12:17 AM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Apr 20, 2022 at 5:53 AM Toke Høiland-Jørgensen <toke@kernel.org> wrote:
> >
> > xiangxia.m.yue@gmail.com writes:
> >
> > > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > >
> > > This patch introduces a new bpf_ktime_get_real_ns helper, which may
> > > help us measure skb latency in the ingress/forwarding path:
> > >
> > > HW/SW[1] -> ip_rcv/tcp_rcv_established -> tcp_recvmsg_locked/tcp_update_recv_tstamps
> > >
> > > * Insert a BPF kprobe into ip_rcv/tcp_rcv_established invoking this helper.
> > >   Then we can inspect how much time has elapsed since the HW/SW timestamp.
> > > * If inserting a BPF kprobe into tcp_update_recv_tstamps, invoked by
> > >   tcp_recvmsg, we can measure how long an skb waits in the tcp receive
> > >   queue. The cause of this latency can be the application fetching the
> > >   TCP messages too late.
> >
> > Why not just use one of the existing ktime helpers and also add a BPF
> > probe to set the initial timestamp instead of relying on skb->tstamp?
>
> You don't even need a BPF probe for this. See [0] for how retsnoop
> converts bpf_ktime_get_ns() into real time.
>
> [0] https://github.com/anakryiko/retsnoop/blob/master/src/retsnoop.c#L649-L668

I tried to calculate this offset too, but there is one case to handle:
if the administrator changes the clock manually, or NTP does, we have to
recalculate the offset. To detect such changes, one solution is to insert
a kprobe into the tk_set_wall_to_mono() kernel function and use a
perf_event to notify userspace.

> > -Toke
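[Editor's note: a hedged sketch of that detection idea — a kprobe on
tk_set_wall_to_mono() that pushes an event to userspace through a perf event
array so the offset can be re-sampled. The map name and the empty payload are
illustrative assumptions; the thread does not include code for this.]

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

struct {
	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
	__uint(key_size, sizeof(u32));
	__uint(value_size, sizeof(u32));
} clock_events SEC(".maps");

SEC("kprobe/tk_set_wall_to_mono")
int BPF_KPROBE(on_clock_step)
{
	u32 marker = 0;	/* payload is irrelevant; the event itself is the signal */

	/* Wake up userspace so it can re-sample the realtime/monotonic offset. */
	bpf_perf_event_output(ctx, &clock_events, BPF_F_CURRENT_CPU,
			      &marker, sizeof(marker));
	return 0;
}

char LICENSE[] SEC("license") = "GPL";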
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index d14b10b85e51..2565c587fe1b 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -5143,6 +5143,18 @@ union bpf_attr {
 *		The **hash_algo** is returned on success,
 *		**-EOPNOTSUP** if the hash calculation failed or **-EINVAL** if
 *		invalid arguments are passed.
+ *
+ * u64 bpf_ktime_get_real_ns(void)
+ *	Description
+ *		Return a fine-grained version of the real (i.e., wall-clock) time,
+ *		in nanoseconds. This clock is affected by discontinuous jumps in
+ *		the system time (e.g., if the system administrator manually changes
+ *		the clock), and by the incremental adjustments performed by adjtime(3)
+ *		and NTP.
+ *		See: **clock_gettime**\ (**CLOCK_REALTIME**)
+ *	Return
+ *		Current *ktime*.
+ *
 */
#define __BPF_FUNC_MAPPER(FN)		\
	FN(unspec),			\
@@ -5339,6 +5351,7 @@ union bpf_attr {
	FN(copy_from_user_task),	\
	FN(skb_set_tstamp),		\
	FN(ima_file_hash),		\
+	FN(ktime_get_real_ns),		\
	/* */

/* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 13e9dbeeedf3..acdf538b1dcd 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2627,6 +2627,7 @@
 const struct bpf_func_proto bpf_get_prandom_u32_proto __weak;
 const struct bpf_func_proto bpf_get_smp_processor_id_proto __weak;
 const struct bpf_func_proto bpf_get_numa_node_id_proto __weak;
 const struct bpf_func_proto bpf_ktime_get_ns_proto __weak;
+const struct bpf_func_proto bpf_ktime_get_real_ns_proto __weak;
 const struct bpf_func_proto bpf_ktime_get_boot_ns_proto __weak;
 const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto __weak;
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 315053ef6a75..d38548ed292f 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -159,6 +159,18 @@ const struct bpf_func_proto bpf_ktime_get_ns_proto = {
 	.ret_type	= RET_INTEGER,
 };

+BPF_CALL_0(bpf_ktime_get_real_ns)
+{
+	/* NMI safe access to clock realtime. */
+	return ktime_get_real_fast_ns();
+}
+
+const struct bpf_func_proto bpf_ktime_get_real_ns_proto = {
+	.func		= bpf_ktime_get_real_ns,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+};
+
 BPF_CALL_0(bpf_ktime_get_boot_ns)
 {
 	/* NMI safe access to clock boottime */
@@ -1410,6 +1422,8 @@ bpf_base_func_proto(enum bpf_func_id func_id)
 		return &bpf_ktime_get_ns_proto;
 	case BPF_FUNC_ktime_get_boot_ns:
 		return &bpf_ktime_get_boot_ns_proto;
+	case BPF_FUNC_ktime_get_real_ns:
+		return &bpf_ktime_get_real_ns_proto;
 	case BPF_FUNC_ringbuf_output:
 		return &bpf_ringbuf_output_proto;
 	case BPF_FUNC_ringbuf_reserve:
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index d14b10b85e51..2565c587fe1b 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -5143,6 +5143,18 @@ union bpf_attr {
 *		The **hash_algo** is returned on success,
 *		**-EOPNOTSUP** if the hash calculation failed or **-EINVAL** if
 *		invalid arguments are passed.
+ *
+ * u64 bpf_ktime_get_real_ns(void)
+ *	Description
+ *		Return a fine-grained version of the real (i.e., wall-clock) time,
+ *		in nanoseconds. This clock is affected by discontinuous jumps in
+ *		the system time (e.g., if the system administrator manually changes
+ *		the clock), and by the incremental adjustments performed by adjtime(3)
+ *		and NTP.
+ *		See: **clock_gettime**\ (**CLOCK_REALTIME**)
+ *	Return
+ *		Current *ktime*.
+ *
 */
#define __BPF_FUNC_MAPPER(FN)		\
	FN(unspec),			\
@@ -5339,6 +5351,7 @@ union bpf_attr {
	FN(copy_from_user_task),	\
	FN(skb_set_tstamp),		\
	FN(ima_file_hash),		\
+	FN(ktime_get_real_ns),		\
	/* */

/* integer value in 'imm' field of BPF_CALL instruction selects which helper
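[Editor's note: for context, a hypothetical usage sketch of the proposed
helper, assuming the patch above is applied and the helper declaration is
regenerated into bpf_helper_defs.h. It compares the current wall-clock time
against skb->tstamp, which carries the CLOCK_REALTIME HW/SW receive timestamp
in this path; the program and section names are illustrative.]

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>	/* assumes a regenerated bpf_helper_defs.h */
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>

SEC("kprobe/tcp_rcv_established")
int BPF_KPROBE(rx_latency, struct sock *sk, struct sk_buff *skb)
{
	/* skb->tstamp holds the HW/SW receive timestamp, in realtime ns. */
	u64 tstamp = BPF_CORE_READ(skb, tstamp);
	u64 now = bpf_ktime_get_real_ns();	/* the proposed helper */

	if (tstamp)
		bpf_printk("skb latency since HW/SW: %llu ns", now - tstamp);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";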