Message ID | 20231114-feature_ptp_netnext-v7-15-472e77951e40@bootlin.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 152c75e1d00200edc4da1beb67dd099a462ea86b |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: Make timestamping selectable | expand |
On Tue, 14 Nov 2023 12:28:43 +0100 Kory Maincent wrote: > + if (!tb[ETHTOOL_A_TS_LAYER]) > + return 0; GENL_REQ_ATTR_CHECK(), not sure why anyone would issue this command without any useful attr. > + /* Disable time stamping in the current layer. */ > + if (netif_device_present(dev) && > + (dev->ts_layer == PHY_TIMESTAMPING || > + dev->ts_layer == MAC_TIMESTAMPING)) { > + ret = dev_set_hwtstamp_phylib(dev, &config, info->extack); > + if (ret < 0) > + return ret; So you only support PHYLIB? The semantics need to be better documented :(
On Sat, 18 Nov 2023 18:34:33 -0800 Jakub Kicinski <kuba@kernel.org> wrote: > On Tue, 14 Nov 2023 12:28:43 +0100 Kory Maincent wrote: > > + if (!tb[ETHTOOL_A_TS_LAYER]) > > + return 0; > > GENL_REQ_ATTR_CHECK(), not sure why anyone would issue this command > without any useful attr. > > > + /* Disable time stamping in the current layer. */ > > + if (netif_device_present(dev) && > > + (dev->ts_layer == PHY_TIMESTAMPING || > > + dev->ts_layer == MAC_TIMESTAMPING)) { > > + ret = dev_set_hwtstamp_phylib(dev, &config, info->extack); > > + if (ret < 0) > > + return ret; > > So you only support PHYLIB? > > The semantics need to be better documented :( Yes as we don't really know how each MAC deal with the timestamping before ndo_hwstamp_get/set. Using phylib only allows us to be sure these NDO are implemented and the management of timestamping is coherent in the MAC. Also It will push people to move on to these NDOs. Ok I will add documentation.
Hi Köry, On Mon, Nov 20, 2023 at 10:44:39AM +0100, Köry Maincent wrote: > On Sat, 18 Nov 2023 18:34:33 -0800 > Jakub Kicinski <kuba@kernel.org> wrote: > > > On Tue, 14 Nov 2023 12:28:43 +0100 Kory Maincent wrote: > > > + if (!tb[ETHTOOL_A_TS_LAYER]) > > > + return 0; > > > > GENL_REQ_ATTR_CHECK(), not sure why anyone would issue this command > > without any useful attr. > > > > > + /* Disable time stamping in the current layer. */ > > > + if (netif_device_present(dev) && > > > + (dev->ts_layer == PHY_TIMESTAMPING || > > > + dev->ts_layer == MAC_TIMESTAMPING)) { > > > + ret = dev_set_hwtstamp_phylib(dev, &config, info->extack); > > > + if (ret < 0) > > > + return ret; > > > > So you only support PHYLIB? > > > > The semantics need to be better documented :( > > Yes as we don't really know how each MAC deal with the timestamping > before ndo_hwstamp_get/set. Using phylib only allows us to be sure these NDO are > implemented and the management of timestamping is coherent in the MAC. Also It > will push people to move on to these NDOs. > > Ok I will add documentation. > > -- > Köry Maincent, Bootlin > Embedded Linux and kernel engineering > https://bootlin.com/ When Jakub says "the semantics need to be better documented", I'm also thinking of a different direction. From what I understand, Maxime is working on representing multiple phylib PHYs in the UAPI: https://patchwork.kernel.org/project/netdevbpf/cover/20231117162323.626979-1-maxime.chevallier@bootlin.com/ Does your UAPI proposal make it possible in any way to select timestamping in phylib PHY A rather than PHY B? Or do you think it is extensible to support that, somehow?
On Mon, 20 Nov 2023 12:52:55 +0200 Vladimir Oltean <vladimir.oltean@nxp.com> wrote: > > > So you only support PHYLIB? > > > > > > The semantics need to be better documented :( > > > > Yes as we don't really know how each MAC deal with the timestamping > > before ndo_hwstamp_get/set. Using phylib only allows us to be sure these > > NDO are implemented and the management of timestamping is coherent in the > > MAC. Also It will push people to move on to these NDOs. > > > > Ok I will add documentation. > > When Jakub says "the semantics need to be better documented", I'm also > thinking of a different direction. > > From what I understand, Maxime is working on representing multiple > phylib PHYs in the UAPI: > https://patchwork.kernel.org/project/netdevbpf/cover/20231117162323.626979-1-maxime.chevallier@bootlin.com/ Yes I am also following his patch series. > Does your UAPI proposal make it possible in any way to select > timestamping in phylib PHY A rather than PHY B? Or do you think it is > extensible to support that, somehow? It does not support it for now. I didn't want to base my work on his series as it could work without it for now and I didn't want to wait to have his series accepted. It is more a future possible support as I don't have anything to test it and I don't know if such hardware exists right now. I think it will be extensible to support that, my thinking was to create this struct in net_device struct: struct { enum layer; u32 id; } ts; With id saving the phy_index of the PHY X used when the layer PHY is selected. This id could also be used to store the timestamp point in case of several timestamp in a MAC. Regards,
On Mon, Nov 20, 2023 at 12:14:40PM +0100, Köry Maincent wrote: > > Does your UAPI proposal make it possible in any way to select > > timestamping in phylib PHY A rather than PHY B? Or do you think it is > > extensible to support that, somehow? > > It does not support it for now. > I didn't want to base my work on his series as it could work without it for now > and I didn't want to wait to have his series accepted. It is more a future > possible support as I don't have anything to test it and I don't know if such > hardware exists right now. > I think it will be extensible to support that, my thinking was to create this > struct in net_device struct: > > struct { > enum layer; > u32 id; > } ts; > > With id saving the phy_index of the PHY X used when the layer PHY is selected. > This id could also be used to store the timestamp point in case of several > timestamp in a MAC. Ok, and I suppose the "u32 id" would be numerically the same as the ETHTOOL_A_HEADER_PHY_INDEX nlattr that Maxime is proposing? The next question would be: if a driver performs PHY management in firmware, and does not use phylib, how should user space interact with it? What timestamping layer and upon what should the ID be chosen? Finally (and unrelated to the question above), why is SOFTWARE_TIMESTAMPING even a layer exposed in the UAPI? My understanding of this patch set is that it is meant to select the source of hardware timestamps that are given to a socket. What gap in the UAPI does the introduction of a SOFTWARE_TIMESTAMPING hwtstamping layer cover?
On Mon, 20 Nov 2023 14:06:01 +0200 Vladimir Oltean <vladimir.oltean@nxp.com> wrote: > On Mon, Nov 20, 2023 at 12:14:40PM +0100, Köry Maincent wrote: > > > Does your UAPI proposal make it possible in any way to select > > > timestamping in phylib PHY A rather than PHY B? Or do you think it is > > > extensible to support that, somehow? > > > > It does not support it for now. > > I didn't want to base my work on his series as it could work without it for > > now and I didn't want to wait to have his series accepted. It is more a > > future possible support as I don't have anything to test it and I don't > > know if such hardware exists right now. > > I think it will be extensible to support that, my thinking was to create > > this struct in net_device struct: > > > > struct { > > enum layer; > > u32 id; > > } ts; > > > > With id saving the phy_index of the PHY X used when the layer PHY is > > selected. This id could also be used to store the timestamp point in case > > of several timestamp in a MAC. > > Ok, and I suppose the "u32 id" would be numerically the same as the > ETHTOOL_A_HEADER_PHY_INDEX nlattr that Maxime is proposing? Yes. > The next question would be: if a driver performs PHY management in > firmware, and does not use phylib, how should user space interact with it? > What timestamping layer and upon what should the ID be chosen? In that case it could be the second options I refereed to. Using the id to select the right timestamp within the NIC driver. It indeed won't be called PHY timestamping as it is managed by the NIC firmware but as it is managed by only one firmware and driver using the id to separate the available timestamp seems a good idea. Another solution would be to create another value in the layer enumeration. PHY_NIC_TIMESTAMPING? Better idea? I am not good at naming. > Finally (and unrelated to the question above), why is SOFTWARE_TIMESTAMPING > even a layer exposed in the UAPI? My understanding of this patch set is > that it is meant to select the source of hardware timestamps that are > given to a socket. What gap in the UAPI does the introduction of a > SOFTWARE_TIMESTAMPING hwtstamping layer cover? As I explained to Jakub: The software timestamping comes from the MAC driver capabilities and I decided to separate software and MAC timestamping. If we select PHY timestamping we can't use software timestamping and for an user, selecting the MAC as timestamping seems not logical to use software timestamping (I got confused myself when I first dig into it long time ago). Be able to select directly Software timestamping seems appropriate and won't bring any harm. What do you think? Regards,
On Mon, Nov 20, 2023 at 02:49:29PM +0100, Köry Maincent wrote: > > The next question would be: if a driver performs PHY management in > > firmware, and does not use phylib, how should user space interact with it? > > What timestamping layer and upon what should the ID be chosen? > > In that case it could be the second options I refereed to. > Using the id to select the right timestamp within the NIC driver. > It indeed won't be called PHY timestamping as it is managed by the NIC firmware > but as it is managed by only one firmware and driver using the id to separate > the available timestamp seems a good idea. > > Another solution would be to create another value in the layer enumeration. > PHY_NIC_TIMESTAMPING? Better idea? I am not good at naming. The point I was trying to make is that your current choice of exposing PHY_TIMESTAMPING in UAPI, when it really only refers to phylib PHYs, would lead exactly to this sort of UAPI balkanization where everyone wants to add more timestamping layers, and to define IDs to be specific to their own invented layer. Maybe the concept of timestamping layers is not what user space should see at all. In previous email discussions, I was proposing to Jakub and you "what if we didn't let user space select a specific layer like PHY_TIMESTAMPING or MAC_TIMESTAMPING at all, but just select a specific phc_index as the provider of hardware timestamps"? The limitation we're trying to lift is that currently, there can be only a single provider of hardware timestamps. We make that provider customizable. There is already a good understanding from user space that, if "ethtool -T" on an interface says there is no PHC, then there are going to be no hardware timestamps. So I thought it would be much more intuitive if the timestamping layer could be selected by the user merely by an unified phc_index (provided by a phylib phy or firmware based driver or whatever), and everything else would just be an implementation detail of the kernel. No one should care that it's a phylib phy, and shouldn't use a different procedure to identify its ID based on whether it's a phylib or firmware PHY. It's a bit hard to align my expectation of what this series should offer with yours. I think we're talking past each other, which unfortunately makes me lose track and interest. I wish you could have answered my earlier question about this alternative proposal. https://lore.kernel.org/netdev/20231013170903.p3ycicebnfrsmoks@skbuf/ > > Finally (and unrelated to the question above), why is SOFTWARE_TIMESTAMPING > > even a layer exposed in the UAPI? My understanding of this patch set is > > that it is meant to select the source of hardware timestamps that are > > given to a socket. What gap in the UAPI does the introduction of a > > SOFTWARE_TIMESTAMPING hwtstamping layer cover? > > As I explained to Jakub: > The software timestamping comes from the MAC driver capabilities and I decided > to separate software and MAC timestamping. Why? What was the problem? This confuses me because I don't understand what is the problem that the solution is trying to address, and whether the solution is orthogonal to all the other UAPI that exists for software and hardware timestamping at the socket layer - which AFAIK can happily coexist. > If we select PHY timestamping we can't use software timestamping and > for an user, selecting the MAC as timestamping seems not logical to > use software timestamping (I got confused myself when I first dig into > it long time ago). Be able to select directly Software timestamping > seems appropriate and won't bring any harm. What do you think? Hmm, can you please explain what is the reason why software timestamping can't coexist with PHY timestamping? It is a genuine question to which I don't have an answer - I haven't used PHY timestamping. It must be something specific to that, since I do know that MAC + software timestamping work simultaneously just fine.
On Mon, 20 Nov 2023 16:23:16 +0200 Vladimir Oltean <vladimir.oltean@nxp.com> wrote: > On Mon, Nov 20, 2023 at 02:49:29PM +0100, Köry Maincent wrote: > > > The next question would be: if a driver performs PHY management in > > > firmware, and does not use phylib, how should user space interact with it? > > > What timestamping layer and upon what should the ID be chosen? > > > > In that case it could be the second options I refereed to. > > Using the id to select the right timestamp within the NIC driver. > > It indeed won't be called PHY timestamping as it is managed by the NIC > > firmware but as it is managed by only one firmware and driver using the id > > to separate the available timestamp seems a good idea. > > > > Another solution would be to create another value in the layer enumeration. > > PHY_NIC_TIMESTAMPING? Better idea? I am not good at naming. > > The point I was trying to make is that your current choice of exposing > PHY_TIMESTAMPING in UAPI, when it really only refers to phylib PHYs, > would lead exactly to this sort of UAPI balkanization where everyone > wants to add more timestamping layers, and to define IDs to be specific > to their own invented layer. Maybe the concept of timestamping layers is > not what user space should see at all. > > In previous email discussions, I was proposing to Jakub and you "what if > we didn't let user space select a specific layer like PHY_TIMESTAMPING > or MAC_TIMESTAMPING at all, but just select a specific phc_index as the > provider of hardware timestamps"? > > The limitation we're trying to lift is that currently, there can be only > a single provider of hardware timestamps. We make that provider customizable. > There is already a good understanding from user space that, if "ethtool -T" > on an interface says there is no PHC, then there are going to be no > hardware timestamps. So I thought it would be much more intuitive if the > timestamping layer could be selected by the user merely by an unified > phc_index (provided by a phylib phy or firmware based driver or whatever), > and everything else would just be an implementation detail of the kernel. > No one should care that it's a phylib phy, and shouldn't use a different > procedure to identify its ID based on whether it's a phylib or firmware > PHY. > > It's a bit hard to align my expectation of what this series should offer > with yours. I think we're talking past each other, which unfortunately > makes me lose track and interest. I wish you could have answered my > earlier question about this alternative proposal. > https://lore.kernel.org/netdev/20231013170903.p3ycicebnfrsmoks@skbuf/ I did thought about it but I got stuck by the case of hardware timestamping without PHC. Richard explained the reason of its existence here: https://lore.kernel.org/netdev/ZS3MKWlnPqTe8gkq@hoboy.vegasvil.org/#t Maybe I got a bit stuck in my implementation and should investigate more your proposition and how to deal with this case. Do you have an idea on how to solve it? > > > Finally (and unrelated to the question above), why is > > > SOFTWARE_TIMESTAMPING even a layer exposed in the UAPI? My understanding > > > of this patch set is that it is meant to select the source of hardware > > > timestamps that are given to a socket. What gap in the UAPI does the > > > introduction of a SOFTWARE_TIMESTAMPING hwtstamping layer cover? > > > > As I explained to Jakub: > > The software timestamping comes from the MAC driver capabilities and I > > decided to separate software and MAC timestamping. > > Why? What was the problem? This confuses me because I don't understand > what is the problem that the solution is trying to address, and whether > the solution is orthogonal to all the other UAPI that exists for > software and hardware timestamping at the socket layer - which AFAIK can > happily coexist. > > > If we select PHY timestamping we can't use software timestamping and > > for an user, selecting the MAC as timestamping seems not logical to > > use software timestamping (I got confused myself when I first dig into > > it long time ago). Be able to select directly Software timestamping > > seems appropriate and won't bring any harm. What do you think? > > Hmm, can you please explain what is the reason why software timestamping > can't coexist with PHY timestamping? It is a genuine question to which I > don't have an answer - I haven't used PHY timestamping. It must be > something specific to that, since I do know that MAC + software > timestamping work simultaneously just fine. The software timestamp is managed through the MAC driver calling skb_tx_timestamp() function. The PHY driver does not call it, that's why there is no software timestamping in PHY driver capabilities. Also the PHY driver doesn't know if the MAC driver support it so it currently can not coexist with PHY timestamping. Regards,
On Mon, Nov 20, 2023 at 03:53:44PM +0100, Köry Maincent wrote: > I did thought about it but I got stuck by the case of hardware timestamping > without PHC. Richard explained the reason of its existence here: > https://lore.kernel.org/netdev/ZS3MKWlnPqTe8gkq@hoboy.vegasvil.org/#t > > Maybe I got a bit stuck in my implementation and should investigate more your > proposition and how to deal with this case. Do you have an idea on how to > solve it? I would take what Richard said with a grain of salt, and interpret as "there exists hardware with hwts but w/o PHC, and that may work for marginal use cases", not that "we should design having that as a first-class citizen in mind". If there is any need for me to point out that such kind of driver isn't a first class citizen, then here's an experiment: diff --git a/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c b/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c index e993ed04ab57..5755f54197b9 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c +++ b/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c @@ -842,13 +842,7 @@ static int enetc_get_ts_info(struct net_device *ndev, { int *phc_idx; - phc_idx = symbol_get(enetc_phc_index); - if (phc_idx) { - info->phc_index = *phc_idx; - symbol_put(enetc_phc_index); - } else { - info->phc_index = -1; - } + info->phc_index = -1; #ifdef CONFIG_FSL_ENETC_PTP_CLOCK info->so_timestamping = SOF_TIMESTAMPING_TX_HARDWARE | When elected as master, it kinda works, and does synchronize with a slave, even if ptp4l gets confused about the phc_index being -1. But when elected as a slave by the BMCA, ptp4l gets confused and thinks that phc_index == -1 means that it's supposed to use software timestamping. $ ptp4l -i eno0 -2 -P -m -s ptp4l[1185.594]: port 1: INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[1185.598]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[1186.887]: port 1: new foreign master 00049f.fffe.05f628-1 ptp4l[1190.889]: selected best master clock 00049f.fffe.05f628 ptp4l[1190.890]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE ptp4l[1191.891]: master offset 37000003850 s0 freq +100000000 path delay 823 ptp4l[1192.896]: master offset 3[ 1235.693485] systemd-journald[125]: Time jumped backwards, rotating. 7000003848 s1 freq +99999998 path delay 822 ptp4l[1193.603]: clockcheck: clock jumped forward or running faster than expected! ptp4l[1193.892]: clockcheck: clock jumped forward or running faster than expected! ptp4l[1193.893]: master offset 37000003852 s0 freq +99999998 path delay 822 ptp4l[1194.342]: clockcheck: clock jumped forward or running faster than expected! ptp4l[1194.604]: clockcheck: clock jumped forward or running faster than expected! So, I guess the only thing we need to do to this kind of setup is not do too much harm to it. We break nothing if we make the phc_index the central aspect of hwts layer selection - except for the fact that such a MAC won't be able to change its timestamping layer to be a PHY. If we wanted to add such a capability to that MAC driver, the obvious way to solve the lack of a PHC is to create a PHC that returns -EOPNOTSUPP for all of its ptp_clock_info operations (gettime, settime etc). It may possibly be that, in the worst case, ptp4l needs to probe for each syscall on the NIC's PHC being operational before deciding what can be done with it. But that's already an improvement over the current handling to make it more graceful, it's not to keep things on par. > > Hmm, can you please explain what is the reason why software timestamping > > can't coexist with PHY timestamping? It is a genuine question to which I > > don't have an answer - I haven't used PHY timestamping. It must be > > something specific to that, since I do know that MAC + software > > timestamping work simultaneously just fine. > > The software timestamp is managed through the MAC driver calling > skb_tx_timestamp() function. The PHY driver does not call it, that's why there > is no software timestamping in PHY driver capabilities. Also the PHY driver > doesn't know if the MAC driver support it so it currently can not coexist with > PHY timestamping. I don't understand. Documentation/networking/timestamping.rst says that skb_tx_timestamp() is one of the actual _mechanisms_ through which phylib timestamping works. It's called by the parent MAC driver, and mii_ts->txtstamp() hooks onto that. So in some situations, the PHY timestamping core _piggybacks_ on top of software timestamping support in the MAC. Where I agree is that the PHY driver has no business in deciding whether the interface should report the socket option flags for software timestamping or not. But I still don't see a proof that they can't coexist. What you need to explain is what makes said software timestamping unusable in the presence of PHY timestamping - to justify this separate software timestamping layer in your UAPI proposal.
On Mon, 20 Nov 2023 18:10:04 +0200 Vladimir Oltean <vladimir.oltean@nxp.com> wrote: > On Mon, Nov 20, 2023 at 03:53:44PM +0100, Köry Maincent wrote: > > I did thought about it but I got stuck by the case of hardware timestamping > > without PHC. Richard explained the reason of its existence here: > > https://lore.kernel.org/netdev/ZS3MKWlnPqTe8gkq@hoboy.vegasvil.org/#t > > > > Maybe I got a bit stuck in my implementation and should investigate more > > your proposition and how to deal with this case. Do you have an idea on how > > to solve it? > > I would take what Richard said with a grain of salt, and interpret as > "there exists hardware with hwts but w/o PHC, and that may work for > marginal use cases", not that "we should design having that as a > first-class citizen in mind". > When elected as master, it kinda works, and does synchronize with a > slave, even if ptp4l gets confused about the phc_index being -1. > > But when elected as a slave by the BMCA, ptp4l gets confused and thinks > that phc_index == -1 means that it's supposed to use software timestamping. > > So, I guess the only thing we need to do to this kind of setup is not do > too much harm to it. > > We break nothing if we make the phc_index the central aspect of hwts > layer selection - except for the fact that such a MAC won't be able to > change its timestamping layer to be a PHY. > > If we wanted to add such a capability to that MAC driver, the obvious > way to solve the lack of a PHC is to create a PHC that returns > -EOPNOTSUPP for all of its ptp_clock_info operations (gettime, settime > etc). It may possibly be that, in the worst case, ptp4l needs to probe > for each syscall on the NIC's PHC being operational before deciding what > can be done with it. But that's already an improvement over the current > handling to make it more graceful, it's not to keep things on par. Ok, you convinced me. Thanks for your arguments and spending time explaining it. Regards,
On Mon, 20 Nov 2023 16:23:16 +0200 Vladimir Oltean wrote: > In previous email discussions, I was proposing to Jakub and you "what if > we didn't let user space select a specific layer like PHY_TIMESTAMPING > or MAC_TIMESTAMPING at all, but just select a specific phc_index as the > provider of hardware timestamps"? What about my use case of having a NIC which can stamp at a low rate at the PHY (for PTP) and high rate at the DMA block (for congestion control)? Both stamp points have the same PHC index.
> What about my use case of having a NIC which can stamp at a low rate > at the PHY (for PTP) and high rate at the DMA block (for congestion > control)? Both stamp points have the same PHC index. How theoretical is that? To me, it seems more likely you have two PHC. The PHY stamping tends to be slow because of the MDIO bus. If the MAC has fast access to the PHC, it means its not on the MDIO bus. It probably means you have the PHY integrated into the MAC/SoC, so the MAC can access it. But if its integrated like this, i don't see why PHY stamping should be particularly slow. So you can probably use it for congestion control. And then you don't need DMA stamping. Do you know of real hardware with a MAC and a PHY sharing a PHC? Andrew
On Mon, 20 Nov 2023 19:39:35 +0100 Andrew Lunn wrote: > > What about my use case of having a NIC which can stamp at a low rate > > at the PHY (for PTP) and high rate at the DMA block (for congestion > > control)? Both stamp points have the same PHC index. > > How theoretical is that? To me, it seems more likely you have two PHC. Very practical. mlx5 does this today, based on guessing and private ethtool flags. > The PHY stamping tends to be slow because of the MDIO bus. If the MAC > has fast access to the PHC, it means its not on the MDIO bus. It > probably means you have the PHY integrated into the MAC/SoC, so the > MAC can access it. But if its integrated like this, i don't see why > PHY stamping should be particularly slow. So you can probably use it > for congestion control. And then you don't need DMA stamping. Tx stamps are harder to carry back to the host all the way from the PHY than from the DMA block when DMA completion is signaled. Rx stamps seem much easier to carry down the pipeline but apparently some vendors are incapable of doing that as well. > Do you know of real hardware with a MAC and a PHY sharing a PHC? mlx5 for sure, but other designs, too. PHY, NIC pipeline and PCIe PTM may all need to time stamp from a single time counter.
On Mon, Nov 20, 2023 at 09:37:23AM -0800, Jakub Kicinski wrote: > On Mon, 20 Nov 2023 16:23:16 +0200 Vladimir Oltean wrote: > > In previous email discussions, I was proposing to Jakub and you "what if > > we didn't let user space select a specific layer like PHY_TIMESTAMPING > > or MAC_TIMESTAMPING at all, but just select a specific phc_index as the > > provider of hardware timestamps"? > > What about my use case of having a NIC which can stamp at a low rate > at the PHY (for PTP) and high rate at the DMA block (for congestion > control)? Both stamp points have the same PHC index. Well, first of all, given my understanding of the "laws of physics", I think something has to give in your use case description. I can't see how on RX, the NIC can decide in advance whether to provide low rate MAC timestamps for packets going to a socket and high rate DMA timestamps for packets going to another socket. It can either provide MAC timestamps, or DMA timestamps, or an unreliable, unpresentable to user space, mix. But maybe I'm wrong and there are NICs which can do that filtering. If such NIC exists, then I guess a SOF_TIMESTAMPING_RX_DMA flag should be added to the socket layer, and the NIC driver provides timestamps according to the skb->sk->sk_tsflags, and that problem is completely out of scope for Köry's patch set - and implicitly compatible with it, since as you say, the device-wide timestamping layer - PHC index - does not really change. If I'm not wrong and the MAC-or-DMA timestamp selection is NIC-wide (which diverges from your problem description), then neither Köry's work nor my "everything is a phc_index" proposal will bring your use case to fruition without further work. Here I would avoid speculating, because a lot will depend upon the details which you haven't really given. One question will be whether, in the case of "NIC-wide DMA timestamps", DMA timestamps should be presented as hardware timestamps - struct scm_timestamping[2] from CMSG_DATA() - or as their own thing, that user space needs explicit support for - by parsing a new cmsg level/type. If DMA timestamps won't look to user space like hardware timestamps, then the use case is again out of scope for Köry's work, as far as I see it. Another simple question is - if NICs do this today - probably by giving the "unrepresentable mix" to user space in an implicit, hardcoded and very fine tuned way such that nobody bats an eye - then what is there more to support? Are you looking at extra UAPI as a way to legitimize hacks, or do you feel there is extra control that applications can gain?
On Mon, Nov 20, 2023 at 07:39:35PM +0100, Andrew Lunn wrote: > > What about my use case of having a NIC which can stamp at a low rate > > at the PHY (for PTP) and high rate at the DMA block (for congestion > > control)? Both stamp points have the same PHC index. > > How theoretical is that? To me, it seems more likely you have two PHC. > > The PHY stamping tends to be slow because of the MDIO bus. If the MAC > has fast access to the PHC, it means its not on the MDIO bus. It > probably means you have the PHY integrated into the MAC/SoC, so the > MAC can access it. But if its integrated like this, i don't see why > PHY stamping should be particularly slow. So you can probably use it > for congestion control. And then you don't need DMA stamping. > > Do you know of real hardware with a MAC and a PHY sharing a PHC? Not so much a MAC and PHY sharing a PHC, but I notice in USXGMII-M, there is the ability for the PHY to pass the timestamp back to the MAC through the USXGMII-M connection - which would eliminate the problems of accessing the PHY to get that the timestamps.
On Mon, 20 Nov 2023 21:00:23 +0200 Vladimir Oltean wrote: > Well, first of all, given my understanding of the "laws of physics", > I think something has to give in your use case description. I can't > see how on RX, the NIC can decide in advance whether to provide low > rate MAC timestamps for packets going to a socket and high rate DMA > timestamps for packets going to another socket. It can either provide > MAC timestamps, or DMA timestamps, or an unreliable, unpresentable to > user space, mix. Rx time stamping is configured by filters. Is there a problem with user specifying that they want "true" timestamps for PTP/NTP packets, and "dma" timestamps for all the rest? Maybe we can extend struct scm_timestamping to carry an indication which stamp ended up in ts[2] but that's less important to me than the ability to configure the thing. Right now, as I said, mlx5 uses an ethtool priv flag :( > But maybe I'm wrong and there are NICs which can do that filtering. > If such NIC exists, then I guess a SOF_TIMESTAMPING_RX_DMA flag should > be added to the socket layer, and the NIC driver provides timestamps > according to the skb->sk->sk_tsflags, and that problem is completely out > of scope for Köry's patch set - and implicitly compatible with it, since > as you say, the device-wide timestamping layer - PHC index - does not > really change. IDK. Maybe the sniffles I picked up at LPC are clouding my judgment but to me this patch set is shaped too much by current implementation and not enough by what it's modeling. It basically exposes to user space the "mux" for choosing NETDEV vs PHYLIB. There are multiple time stamping points as the packet moves thru the pipeline. Expose them so that SIOC[GS]HWTSTAMP can target each on individually. > If I'm not wrong and the MAC-or-DMA timestamp selection is NIC-wide > (which diverges from your problem description), Nope. > then neither Köry's work > nor my "everything is a phc_index" proposal will bring your use case to > fruition without further work. Here I would avoid speculating, because a > lot will depend upon the details which you haven't really given. What are the details you'd like? PTP gets stamped at the PHY/MAC, the rest gets stamped at DMA. mlx5 achieves this by splitting the PTP traffic to a separate queue pair, and configuring that qp to capture PHY/MAC stamps, AFAIU. > One question will be whether, in the case of "NIC-wide DMA timestamps", > DMA timestamps should be presented as hardware timestamps - struct > scm_timestamping[2] from CMSG_DATA() - or as their own thing, that user > space needs explicit support for - by parsing a new cmsg level/type. > If DMA timestamps won't look to user space like hardware timestamps, > then the use case is again out of scope for Köry's work, as far as I see > it. > > Another simple question is - if NICs do this today - probably by giving > the "unrepresentable mix" to user space in an implicit, hardcoded and > very fine tuned way such that nobody bats an eye - then what is there > more to support? Are you looking at extra UAPI as a way to legitimize > hacks, or do you feel there is extra control that applications can gain? I don't understand what you're asking me. DMA timestamping is becoming increasingly important. Ready any congestion control paper from the last 5 years and chances are it will be using delay as a signal. If we're extending uAPI for Hw stamping we should make sure to cater to CC use cases.
On Mon, Nov 20, 2023 at 10:51:48AM -0800, Jakub Kicinski wrote: > On Mon, 20 Nov 2023 19:39:35 +0100 Andrew Lunn wrote: > > Do you know of real hardware with a MAC and a PHY sharing a PHC? > > mlx5 for sure, but other designs, too. PHY, NIC pipeline and PCIe PTM > may all need to time stamp from a single time counter. I'm still waiting for you to fully clarify the "per socket vs global" aspect, but independently of that, at least I understand why this is a counter-argument to my proposal. I need to tune it a bit (ASSUMING that we want DMA timestamps to "look like" hwtimestamps, and not like their own thing, to user space), because the PHC index would no longer fully identify a hwtstamp provider, so we need something more. I imagine both ETHTOOL_MSG_TSINFO_GET and ETHTOOL_MSG_TSINFO_SET to support a new (nest) nlattr called ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER. This would contain (u32) ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER_PHC_INDEX and (u32) ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER_QUALIFIER. It could be extensible in the future, but this is the baseline and forms the key. The latter takes values from an: enum ethtool_hwstamp_provider_qualifier { ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_MAC, ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_PHY, ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_DMA, }; with Jakub's comments about the various types providing various qualities of timestamps, given here: https://lore.kernel.org/netdev/20230511150902.57d9a437@kernel.org/ - PHY - per spec, at the RS layer - MAC - "close to the wire" in the MAC, specifically the pipeline delay (PHY stamp vs MAC stamp) should be constant for all packets; there must be no variable-time buffering and (for Tx) the time stamping must be past the stage of the pipeline affected by pause frames - DMA - worst quality, variable delay timestamp, usually taken when packets DMA descriptors (Rx or completion) are being written It _sounds_ like we've all been talking about the same thing for ages, but we weren't. So, a PHC could offer multiple hwtstamp providers, as many as there are qualifiers to uniquely describe them. Each hwstamp provider is represented by a single ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER nested nlattr. In TSINFO_GET requests, there are as many ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER nests as there are hwtstamp providers for the NIC. In the "normal" case of one single hwtstamp provider per PHC, it would be the responsibility of the driver to set its qualifier to the right thing: phylib to ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_PHY, and "normal" MAC drivers to ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_MAC. Here, the qualifier isn't more than an extra (partially redundant) mechanism for user space to know what it's juggling with. As opposed to Köry's proposal (where "dev->ts_layer == PHY_TIMESTAMPING" means actual phylib), ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_PHY describes the PHY-like timestamp quality of any hwtstamp provider, be it provided by a phylib PHY or a firmware-based PHY. It doesn't describe "the layer" itself. Does this tick all boxes?
On Mon, Nov 20, 2023 at 11:58:39AM -0800, Jakub Kicinski wrote: > Rx time stamping is configured by filters. Is there a problem with user > specifying that they want "true" timestamps for PTP/NTP packets, and > "dma" timestamps for all the rest? There is, because enum hwtstamp_rx_filters is NIC-wide, and there is only one of those - corresponding to the single hwtstamp (ts[2]) provider. There were never talks in this patch set, AFAIR, about multiple hwtstamp providers active simultaneously (for different traffic streams) and configuring them independently, with separate RX filters. > Maybe we can extend struct scm_timestamping to carry an indication > which stamp ended up in ts[2] but that's less important to me than > the ability to configure the thing. Right now, as I said, mlx5 uses > an ethtool priv flag :( No, you misunderstood me. I didn't suggest (at least not here) to add an indication to struct scm_timestamping of "what's the source of ts[2] (the hwtstamp)". I was just _asking_ (collecting data) whether it's ultimately desirable for DMA timestamps to be visible in ts[2] (indistinguishable from a better hwtstamp, as they currently are, I guess) rather than their own thing. Like for example, in a congestion control algorithm, where does TCP really get them from. If they'd rather be their own thing in a fully developed API, then the whole discussion is rather off-topic to Köry, because here, we're beating the dead horse of "where does ts[2] come from" - _still_ a single source, just selectable, that is. > > But maybe I'm wrong and there are NICs which can do that filtering. > > If such NIC exists, then I guess a SOF_TIMESTAMPING_RX_DMA flag should > > be added to the socket layer, and the NIC driver provides timestamps > > according to the skb->sk->sk_tsflags, and that problem is completely out > > of scope for Köry's patch set - and implicitly compatible with it, since > > as you say, the device-wide timestamping layer - PHC index - does not > > really change. > > IDK. Maybe the sniffles I picked up at LPC are clouding my judgment > but to me this patch set is shaped too much by current implementation > and not enough by what it's modeling. It basically exposes to user > space the "mux" for choosing NETDEV vs PHYLIB. The last sentence I agree with. > There are multiple time stamping points as the packet moves thru > the pipeline. Expose them so that SIOC[GS]HWTSTAMP can target each > on individually. Ok, that is rather vague and complex. You will forever have to contend with the fact that struct scm_timestamping can contain a single hwtstamp per packet: ts[2]. So you need complex control path logic to ensure that the sum of RX filters for all timestamping points in the packet data path doesn't actually request more than one ts[2] for any skb. I understand how that could scrath your itch, but here, it sounds off-topic? This is actually starting to get close, in a way, to my feedback to Richard to allow multiple hwtstamp sources for an skb, and to just give an indication in the cmsg of what's their source, leaving user space to figure out the rest. https://lore.kernel.org/netdev/20220120164832.xdebp5vykib6h6dp@skbuf/ But his response was "There was a fair amount of discussion, and it seemed to me that everyone wanted a pony." https://lore.kernel.org/netdev/Y%2F0Idkhy27TObawi@hoboy.vegasvil.org/ I mean, IDK, maybe it's not off-topic, but it's a round-about way of achieving what they think they can achieve in a more straightforward way. Rephrased in my own words, you're saying: Forget the concept of an active hwtstamp provider, just open up the knobs of _all_ possible hwtstamp providers for a NIC. Simultaneously! To make one active and all the others inactive, just use HWTSTAMP_FILTER_NONE/HWTSTAMP_TX_OFF for all except one, and the desired enum hwtstamp_rx_filters / enum hwtstamp_tx_types for the active one. Live with this expanded configuration model for a while, just restricted for a single active timestamping layer, and then, once user space is ready for an enhanced struct scm_timestamping which supports potentially multiple cmsgs with distinct hwtstamps, remove the restriction and let it all rip! Everybody gets their pony! Additionally, SIOCSHWTSTAMP is kinda rusty, has a fixed binary format, and is not extensible to target a specific hwtstamp provider. So a netlink conversion of that, as a first step, would of course be great. Is it an accurate summary? If it is, I'll let Köry comment on the feasibility :) > > If I'm not wrong and the MAC-or-DMA timestamp selection is NIC-wide > > (which diverges from your problem description), > > Nope. > > > then neither Köry's work > > nor my "everything is a phc_index" proposal will bring your use case to > > fruition without further work. Here I would avoid speculating, because a > > lot will depend upon the details which you haven't really given. > > What are the details you'd like? PTP gets stamped at the PHY/MAC, > the rest gets stamped at DMA. mlx5 achieves this by splitting the > PTP traffic to a separate queue pair, and configuring that qp to > capture PHY/MAC stamps, AFAIU. That's enough, thanks. > > One question will be whether, in the case of "NIC-wide DMA timestamps", > > DMA timestamps should be presented as hardware timestamps - struct > > scm_timestamping[2] from CMSG_DATA() - or as their own thing, that user > > space needs explicit support for - by parsing a new cmsg level/type. > > If DMA timestamps won't look to user space like hardware timestamps, > > then the use case is again out of scope for Köry's work, as far as I see > > it. > > > > Another simple question is - if NICs do this today - probably by giving > > the "unrepresentable mix" to user space in an implicit, hardcoded and > > very fine tuned way such that nobody bats an eye - then what is there > > more to support? Are you looking at extra UAPI as a way to legitimize > > hacks, or do you feel there is extra control that applications can gain? > > I don't understand what you're asking me. You've partially answered above. The mix of timestamps coming from the PHY/MAC and those coming from the DMA is unrepresentable in today's UAPI, and is just fine-tuned to work for the existing use case of "PTP gets PHY/MAC, everything else gets DMA". Still not 100% clear what would the proper UAPI (separate user-controllable RX filters for PHY, MAC and DMA) gain, in addition to what exists in mlx5. > DMA timestamping is becoming increasingly important. Ready any > congestion control paper from the last 5 years and chances are > it will be using delay as a signal. If we're extending uAPI > for Hw stamping we should make sure to cater to CC use cases. I'll stop commenting here for a while and go read some of those papers and RFCs, in the hope that I find some info about the way in which TCP expects hwtstamps from a NIC. It's quite evident to me that I don't have enough information to help this discussion reach a conclusion.
On Mon, 20 Nov 2023 23:17:59 +0200 Vladimir Oltean wrote: > Forget the concept of an active hwtstamp provider, just open up the > knobs of _all_ possible hwtstamp providers for a NIC. Simultaneously! > To make one active and all the others inactive, just use > HWTSTAMP_FILTER_NONE/HWTSTAMP_TX_OFF for all except one, and the desired > enum hwtstamp_rx_filters / enum hwtstamp_tx_types for the active one. > Live with this expanded configuration model for a while, just restricted > for a single active timestamping layer, and then, once user space is > ready for an enhanced struct scm_timestamping which supports potentially > multiple cmsgs with distinct hwtstamps, remove the restriction and let > it all rip! Everybody gets their pony! > > Additionally, SIOCSHWTSTAMP is kinda rusty, has a fixed binary format, > and is not extensible to target a specific hwtstamp provider. So a > netlink conversion of that, as a first step, would of course be great. > > Is it an accurate summary? Yes. For now we can impose the requirement that only one can be active easily at the kernel level. But the uAPI should allow expressing more. > You've partially answered above. The mix of timestamps coming from the > PHY/MAC and those coming from the DMA is unrepresentable in today's > UAPI, and is just fine-tuned to work for the existing use case of "PTP > gets PHY/MAC, everything else gets DMA". > > Still not 100% clear what would the proper UAPI (separate user-controllable > RX filters for PHY, MAC and DMA) gain, in addition to what exists in mlx5. Too late for mlx5 but I'm anticipating that more vendors will start needing such configuration in the future. At which point it will be good to have an API in place.
On Mon, 20 Nov 2023 21:58:58 +0200 Vladimir Oltean wrote: > I'm still waiting for you to fully clarify the "per socket vs global" > aspect, but independently of that, at least I understand why this is a > counter-argument to my proposal. I need to tune it a bit (ASSUMING that > we want DMA timestamps to "look like" hwtimestamps, and not like their > own thing, to user space), because the PHC index would no longer fully > identify a hwtstamp provider, so we need something more. > > I imagine both ETHTOOL_MSG_TSINFO_GET and ETHTOOL_MSG_TSINFO_SET to > support a new (nest) nlattr called ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER. > > This would contain (u32) ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER_PHC_INDEX > and (u32) ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER_QUALIFIER. It could be > extensible in the future, but this is the baseline and forms the key. > > The latter takes values from an: > > enum ethtool_hwstamp_provider_qualifier { > ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_MAC, > ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_PHY, > ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_DMA, > }; Sounds reasonable. Having more attributes than just PHC index works. Given the lack of distinction between MAC and PHY for integrated NICs I'd lean towards ditching the "layers" completely and exposing an "approximate" vs "precise" boolean. Approximate being the DMA point for NICs, but more generically a point that is separated from the wire by buffering or other variable length delay. Precise == IEEE 1588 quality.
On Mon, Nov 20, 2023 at 01:37:37PM -0800, Jakub Kicinski wrote: > > Is it an accurate summary? > > Yes. > > For now we can impose the requirement that only one can be active > easily at the kernel level. But the uAPI should allow expressing more. I see. That's quite something to think about for Köry. In its defense, I also agree that this idea seems the most orthogonal to everything else that we have or may want to add in the future, and is not likely to become obsoleted by some other mechanism that can achieve the same thing, but in a more flexible way. It's just that it's quite the task. I sense it may be time to dust off and submit the rest of my ndo_hwtstamp_get()/ ndo_hwtstamp_set() conversions before a netlink conversion of SIOCGHWTSTAMP/SIOCSHWTSTAMP could even take place... https://github.com/vladimiroltean/linux/commits/ndo-hwtstamp-v9
On Tue, 21 Nov 2023 00:05:49 +0200 Vladimir Oltean <vladimir.oltean@nxp.com> wrote: > On Mon, Nov 20, 2023 at 01:37:37PM -0800, Jakub Kicinski wrote: > > > Is it an accurate summary? > > > > Yes. > > > > For now we can impose the requirement that only one can be active > > easily at the kernel level. But the uAPI should allow expressing more. > > I see. That's quite something to think about for Köry. In its defense, > I also agree that this idea seems the most orthogonal to everything else > that we have or may want to add in the future, and is not likely to > become obsoleted by some other mechanism that can achieve the same > thing, but in a more flexible way. It's just that it's quite the task. > > I sense it may be time to dust off and submit the rest of my > ndo_hwtstamp_get()/ ndo_hwtstamp_set() conversions before a netlink > conversion of SIOCGHWTSTAMP/SIOCSHWTSTAMP could even take place... > https://github.com/vladimiroltean/linux/commits/ndo-hwtstamp-v9 Ok I kind of got an idea of what is your prerequisites. If I summarize, a solution could be this: - Expand struct hwtstamp_config with a phc_index member for the SIOCG/SHWTSTAMP commands. To keep backward compatibility if phc_index is not set in the hwtstamp_config data from userspace use the default hwtstamp (the default being selected as done in my patch series). Is this possible, would it breaks things? - In netlink part, send one netlink tsinfo skb for each phc_index. Could be done in a later patch series: - Expand netlink TSINFO with ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER_QUALIFIER. Describing this struct: enum ethtool_hwstamp_provider_qualifier { ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_PRECISE, ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_APPROX, }; Set the desired qualifier through TSINFO_SET or through SIOCSHWTSTAMP by expanding again the struct hwtstamp_config. Do you think this is feasible? I might miss some core stuff. Regards,
On Tue, 21 Nov 2023 18:31:14 +0100 Köry Maincent wrote: > - Expand struct hwtstamp_config with a phc_index member for the SIOCG/SHWTSTAMP > commands. > To keep backward compatibility if phc_index is not set in the hwtstamp_config > data from userspace use the default hwtstamp (the default being selected as > done in my patch series). > Is this possible, would it breaks things? I'd skip this bit, and focus on the ETHTOOL_TSINFO. Keep the ioctl as "legacy" and do all the extensions in ethtool. TSINFO_GET can serve as GET, to avoid adding 3rd command for the same thing. TSINFO_SET would be new (as you indicate below). > - In netlink part, send one netlink tsinfo skb for each phc_index. phc_index and netdev combination. A DO command can only generate one answer (or rather, it should generate only one answer, there are few hard rules in netlink). So we need to move that functionality to DUMP. We can filter the DUMP based on user-provided ifindex and/or phc_index. > Could be done in a later patch series: > - Expand netlink TSINFO with ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER_QUALIFIER. > Describing this struct: > enum ethtool_hwstamp_provider_qualifier { > ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_PRECISE, > ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_APPROX, > }; > > Set the desired qualifier through TSINFO_SET or through SIOCSHWTSTAMP by > expanding again the struct hwtstamp_config. > > Do you think this is feasible?
On Tue, Nov 21, 2023 at 06:31:14PM +0100, Köry Maincent wrote: > If I summarize, a solution could be this: > > - Expand struct hwtstamp_config with a phc_index member for the SIOCG/SHWTSTAMP > commands. > To keep backward compatibility if phc_index is not set in the hwtstamp_config > data from userspace use the default hwtstamp (the default being selected as > done in my patch series). > Is this possible, would it breaks things? You can't "expand" hwtstamp_config because it is an established ABI. (You could introduce SIOC[GS]HWTSTAMP_2 with an expanded layout) Thanks, Richard
On Tue, 21 Nov 2023 09:43:54 -0800 Jakub Kicinski <kuba@kernel.org> wrote: > On Tue, 21 Nov 2023 18:31:14 +0100 Köry Maincent wrote: > > - Expand struct hwtstamp_config with a phc_index member for the > > SIOCG/SHWTSTAMP commands. > > To keep backward compatibility if phc_index is not set in the > > hwtstamp_config data from userspace use the default hwtstamp (the default > > being selected as done in my patch series). > > Is this possible, would it breaks things? > > I'd skip this bit, and focus on the ETHTOOL_TSINFO. Keep the ioctl as > "legacy" and do all the extensions in ethtool. TSINFO_GET can serve > as GET, to avoid adding 3rd command for the same thing. TSINFO_SET > would be new (as you indicate below). You say this patch series should simply add TSINFO_SET command to set the current phc_index? It won't solve your requirement of having simultaneous hwtimestamp and enabling/disabling them through rx_filter and tx_types. You want to do this in another patch series alongside a new SIOCG/SHWTSTAMP_2 ABI? > > - In netlink part, send one netlink tsinfo skb for each phc_index. > > phc_index and netdev combination. A DO command can only generate one > answer (or rather, it should generate only one answer, there are few > hard rules in netlink). So we need to move that functionality to DUMP. > We can filter the DUMP based on user-provided ifindex and/or phc_index. Currently, the dumpit function is assigned to ethnl_default_dumpit. Wouldn't the behavior change of the dumpit callback break the ABI? > > Could be done in a later patch series: > > - Expand netlink TSINFO with ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER_QUALIFIER. > > Describing this struct: > > enum ethtool_hwstamp_provider_qualifier { > > ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_PRECISE, > > ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_APPROX, > > }; > > > > Set the desired qualifier through TSINFO_SET or through SIOCSHWTSTAMP by > > expanding again the struct hwtstamp_config. Just wondering to have a insight of future support, in the case of several provider qualifier and the SIOCG/SHWTSTAMP_2 layout containing the phc_index. Will we be able to talk to the two providers qualifiers simultaneously or is it not possible. To know if the SIOCG/SHWTSTAMP_2 layout would contain the description of the qualifier provider. If I understand well your mail in the thread it will be the case right? Regards,
On Wed, Nov 22, 2023 at 02:44:53PM +0100, Köry Maincent wrote: > On Tue, 21 Nov 2023 09:43:54 -0800 > Jakub Kicinski <kuba@kernel.org> wrote: > > > On Tue, 21 Nov 2023 18:31:14 +0100 Köry Maincent wrote: > > > - Expand struct hwtstamp_config with a phc_index member for the > > > SIOCG/SHWTSTAMP commands. > > > To keep backward compatibility if phc_index is not set in the > > > hwtstamp_config data from userspace use the default hwtstamp (the default > > > being selected as done in my patch series). > > > Is this possible, would it breaks things? > > > > I'd skip this bit, and focus on the ETHTOOL_TSINFO. Keep the ioctl as > > "legacy" and do all the extensions in ethtool. TSINFO_GET can serve > > as GET, to avoid adding 3rd command for the same thing. TSINFO_SET > > would be new (as you indicate below). > > You say this patch series should simply add TSINFO_SET command to set the > current phc_index? > > It won't solve your requirement of having simultaneous hwtimestamp and > enabling/disabling them through rx_filter and tx_types. > You want to do this in another patch series alongside a new SIOCG/SHWTSTAMP_2 > ABI? > > > > - In netlink part, send one netlink tsinfo skb for each phc_index. > > > > phc_index and netdev combination. A DO command can only generate one > > answer (or rather, it should generate only one answer, there are few > > hard rules in netlink). So we need to move that functionality to DUMP. > > We can filter the DUMP based on user-provided ifindex and/or phc_index. > > Currently, the dumpit function is assigned to ethnl_default_dumpit. Wouldn't > the behavior change of the dumpit callback break the ABI? > > > > > Could be done in a later patch series: > > > - Expand netlink TSINFO with ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER_QUALIFIER. > > > Describing this struct: > > > enum ethtool_hwstamp_provider_qualifier { > > > ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_PRECISE, > > > ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_APPROX, > > > }; > > > > > > Set the desired qualifier through TSINFO_SET or through SIOCSHWTSTAMP by > > > expanding again the struct hwtstamp_config. > > Just wondering to have a insight of future support, in the case of several > provider qualifier and the SIOCG/SHWTSTAMP_2 layout containing the phc_index. > Will we be able to talk to the two providers qualifiers simultaneously or is it > not possible. To know if the SIOCG/SHWTSTAMP_2 layout would contain the > description of the qualifier provider. > If I understand well your mail in the thread it will be the case right? > > Regards, > -- > Köry Maincent, Bootlin > Embedded Linux and kernel engineering > https://bootlin.com/ My understanding of Jakub's email was that he wants to see the functionality offered by SIOCGHWTSTAMP and SIOCSHWTSTAMP converted to netlink. I don't think that ethtool is the correct netlink family for that, given that these aren't ethtool ioctls to begin with. Maybe the new netdev netlink family. The conversion in its basic form would offer exactly the same functionality. The extended netlink messages would have extra attributes to identify the targeted hwtstamp provider. In the lack of those attributes, the default hwtstamp provider is targeted. The definition of the default hwtstamp provider should be as per your current patch set (netdev, with a whitelist for current phylib PHYs). The _listing_ of hwtstamp providers is what could be done through ethtool netlink, similar but not identical to the way in which you are proposing today (you are presenting blanket "layers" which correspond to netdev and phylib, rather than individual providers). The concept of an "active phc_index" would not explicitly exist in the UAPI. Thus I'm not sure what's with this TSINFO_SET being floated around. The only thing would exist is a configurable rx_filter and tx_type per hwtstamp provider (aka "{phc_index, qualifier}"). User space will have to learn to select the hwtstamp provider it wants to configure through netlink, and use for its class of traffic. This is why I mentioned by ndo_hwtstamp_set() conversion, because suddenly it is a prerequisite for any further progress to be done. You can't convert SIOCSHWTSTAMP to netlink if there are some driver implementations which still use ndo_eth_ioctl(). They need to be UAPI-agnostic. I'm not sure what's with Richard's mention of the "_2" variants of the ioctls. Probably a low-effort suggestion which was a bit out of context. His main point, that you cannot extend struct hwtstamp_config as that has a fixed binary format, is perfectly valid though. This is why netlink is preferable, because if done correctly (meaning not with NLA_BINARY attributes), then it is much more extensible because all attributes are TLVs. Use NLA_BINARY, and you will run into the exact extensibility issues that the ioctl interface has.
On Wed, Nov 22, 2023 at 04:08:50PM +0200, Vladimir Oltean wrote: > The concept of an "active phc_index" would not explicitly exist in the > UAPI. Thus I'm not sure what's with this TSINFO_SET being floated around. > The only thing would exist is a configurable rx_filter and tx_type per > hwtstamp provider (aka "{phc_index, qualifier}"). User space will have > to learn to select the hwtstamp provider it wants to configure through > netlink, and use for its class of traffic. One clarification: the extended timestamping filters are PER NETDEV (in addition to being per one of the hwtstamp providers listed for that netdev). This was understated from the fact that the netlink interface itself targets a netdev, but I didn't say it explicitly.
On Wed, Nov 22, 2023 at 04:08:50PM +0200, Vladimir Oltean wrote: > The concept of an "active phc_index" would not explicitly exist in the > UAPI. Thus I'm not sure what's with this TSINFO_SET being floated around. > The only thing would exist is a configurable rx_filter and tx_type per > hwtstamp provider (aka "{phc_index, qualifier}"). User space will have > to learn to select the hwtstamp provider it wants to configure through > netlink, and use for its class of traffic. @Jakub, for your long-term "MAC timestamps for PTP, DMA for everything else". How do you see this? I guess we need some sort of priority function in the UAPI between hwtstamp providers. And even with that, I think the enums that we currently have for filters are not specific enough. The most we could expose is: MAC provider DMA provider hwtstamp_rx_filters HWTSTAMP_FILTER_PTP_V2_EVENT HWTSTAMP_FILTER_ALL tx_type HWTSTAMP_TX_ON HWTSTAMP_TX_ON but it isn't clear: for PTP, does the DMA provider give you an RX timestamp too? What about a TX timestamp?
On Wed, 22 Nov 2023 16:08:50 +0200 Vladimir Oltean <vladimir.oltean@nxp.com> wrote: > On Wed, Nov 22, 2023 at 02:44:53PM +0100, Köry Maincent wrote: > > On Tue, 21 Nov 2023 09:43:54 -0800 > > Jakub Kicinski <kuba@kernel.org> wrote: > > > > > On Tue, 21 Nov 2023 18:31:14 +0100 Köry Maincent wrote: > > > > - Expand struct hwtstamp_config with a phc_index member for the > > > > SIOCG/SHWTSTAMP commands. > > > > To keep backward compatibility if phc_index is not set in the > > > > hwtstamp_config data from userspace use the default hwtstamp (the > > > > default being selected as done in my patch series). > > > > Is this possible, would it breaks things? > > > > > > I'd skip this bit, and focus on the ETHTOOL_TSINFO. Keep the ioctl as > > > "legacy" and do all the extensions in ethtool. TSINFO_GET can serve > > > as GET, to avoid adding 3rd command for the same thing. TSINFO_SET > > > would be new (as you indicate below). > > > > You say this patch series should simply add TSINFO_SET command to set the > > current phc_index? > > > > It won't solve your requirement of having simultaneous hwtimestamp and > > enabling/disabling them through rx_filter and tx_types. > > You want to do this in another patch series alongside a new > > SIOCG/SHWTSTAMP_2 ABI? > > > > > > - In netlink part, send one netlink tsinfo skb for each phc_index. > > > > > > phc_index and netdev combination. A DO command can only generate one > > > answer (or rather, it should generate only one answer, there are few > > > hard rules in netlink). So we need to move that functionality to DUMP. > > > We can filter the DUMP based on user-provided ifindex and/or phc_index. > > > > Currently, the dumpit function is assigned to ethnl_default_dumpit. Wouldn't > > the behavior change of the dumpit callback break the ABI? > > > > > > > > Could be done in a later patch series: > > > > - Expand netlink TSINFO with > > > > ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER_QUALIFIER. Describing this struct: > > > > enum ethtool_hwstamp_provider_qualifier { > > > > ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_PRECISE, > > > > ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_APPROX, > > > > }; > > > > > > > > Set the desired qualifier through TSINFO_SET or through SIOCSHWTSTAMP > > > > by expanding again the struct hwtstamp_config. > > > > Just wondering to have a insight of future support, in the case of several > > provider qualifier and the SIOCG/SHWTSTAMP_2 layout containing the > > phc_index. Will we be able to talk to the two providers qualifiers > > simultaneously or is it not possible. To know if the SIOCG/SHWTSTAMP_2 > > layout would contain the description of the qualifier provider. > > If I understand well your mail in the thread it will be the case right? > My understanding of Jakub's email was that he wants to see the functionality > offered by SIOCGHWTSTAMP and SIOCSHWTSTAMP converted to netlink. I don't > think that ethtool is the correct netlink family for that, given that > these aren't ethtool ioctls to begin with. Maybe the new netdev netlink > family. The conversion in its basic form would offer exactly the same > functionality. The extended netlink messages would have extra attributes > to identify the targeted hwtstamp provider. In the lack of those > attributes, the default hwtstamp provider is targeted. The definition of > the default hwtstamp provider should be as per your current patch set > (netdev, with a whitelist for current phylib PHYs). > > The _listing_ of hwtstamp providers is what could be done through ethtool > netlink, similar but not identical to the way in which you are proposing > today (you are presenting blanket "layers" which correspond to netdev and > phylib, rather than individual providers). > > The concept of an "active phc_index" would not explicitly exist in the > UAPI. Thus I'm not sure what's with this TSINFO_SET being floated around. > The only thing would exist is a configurable rx_filter and tx_type per > hwtstamp provider (aka "{phc_index, qualifier}"). User space will have > to learn to select the hwtstamp provider it wants to configure through > netlink, and use for its class of traffic. > > This is why I mentioned by ndo_hwtstamp_set() conversion, because > suddenly it is a prerequisite for any further progress to be done. > You can't convert SIOCSHWTSTAMP to netlink if there are some driver > implementations which still use ndo_eth_ioctl(). They need to be > UAPI-agnostic. > > I'm not sure what's with Richard's mention of the "_2" variants of the > ioctls. Probably a low-effort suggestion which was a bit out of context. > His main point, that you cannot extend struct hwtstamp_config as that > has a fixed binary format, is perfectly valid though. This is why > netlink is preferable, because if done correctly (meaning not with > NLA_BINARY attributes), then it is much more extensible because all > attributes are TLVs. Use NLA_BINARY, and you will run into the exact > extensibility issues that the ioctl interface has. I appreciate the clarification, it now makes more sense to me. Regards,
On Wed, 22 Nov 2023 16:08:50 +0200 Vladimir Oltean wrote: > My understanding of Jakub's email was that he wants to see the functionality > offered by SIOCGHWTSTAMP and SIOCSHWTSTAMP converted to netlink. I don't > think that ethtool is the correct netlink family for that, given that > these aren't ethtool ioctls to begin with. Maybe the new netdev netlink > family. The conversion in its basic form would offer exactly the same > functionality. Well, ethtool has been the catch all for a lot of random things for the longest time. The question is whether we want to extend ETHTOOL_GET_TS_INFO or add a third API somewhere else. And if we do - do we also duplicate the functionality of ETHTOOL_GET_TS_INFO (i.e. getting capabilities)? My vote is that keeping it in ethtool is less bad than 3rd API. > The _listing_ of hwtstamp providers is what could be done through ethtool > netlink, similar but not identical to the way in which you are proposing > today (you are presenting blanket "layers" which correspond to netdev and > phylib, rather than individual providers). > > The concept of an "active phc_index" would not explicitly exist in the > UAPI. Thus I'm not sure what's with this TSINFO_SET being floated around. > The only thing would exist is a configurable rx_filter and tx_type per > hwtstamp provider (aka "{phc_index, qualifier}"). User space will have > to learn to select the hwtstamp provider it wants to configure through > netlink, and use for its class of traffic. "Active provider" is the one that has TX_ON, rx != FILTER_NONE, right? > This is why I mentioned by ndo_hwtstamp_set() conversion, because > suddenly it is a prerequisite for any further progress to be done. > You can't convert SIOCSHWTSTAMP to netlink if there are some driver > implementations which still use ndo_eth_ioctl(). They need to be > UAPI-agnostic. Right, definitely. > I'm not sure what's with Richard's mention of the "_2" variants of the > ioctls. Probably a low-effort suggestion which was a bit out of context. > His main point, that you cannot extend struct hwtstamp_config as that > has a fixed binary format, is perfectly valid though. This is why > netlink is preferable, because if done correctly (meaning not with > NLA_BINARY attributes), then it is much more extensible because all > attributes are TLVs. Use NLA_BINARY, and you will run into the exact > extensibility issues that the ioctl interface has.
On Wed, 22 Nov 2023 16:36:18 +0200 Vladimir Oltean wrote: > @Jakub, for your long-term "MAC timestamps for PTP, DMA for everything else". > How do you see this? I guess we need some sort of priority function in > the UAPI between hwtstamp providers. > > And even with that, I think the enums that we currently have for filters > are not specific enough. The most we could expose is: > > MAC provider DMA provider > > hwtstamp_rx_filters HWTSTAMP_FILTER_PTP_V2_EVENT HWTSTAMP_FILTER_ALL > tx_type HWTSTAMP_TX_ON HWTSTAMP_TX_ON > > but it isn't clear: for PTP, does the DMA provider give you an RX > timestamp too? If we phrase it as "precise / approximate" rather than "MAC / DMA" - it seems fairly intuitive to give the best timestamp available for a given packet, no? > What about a TX timestamp? I was thinking - socket flag to make packets for a given socket request precise timestamps.
On Wed, Nov 22, 2023 at 08:50:00AM -0800, Jakub Kicinski wrote: > On Wed, 22 Nov 2023 16:08:50 +0200 Vladimir Oltean wrote: > > My understanding of Jakub's email was that he wants to see the functionality > > offered by SIOCGHWTSTAMP and SIOCSHWTSTAMP converted to netlink. I don't > > think that ethtool is the correct netlink family for that, given that > > these aren't ethtool ioctls to begin with. Maybe the new netdev netlink > > family. The conversion in its basic form would offer exactly the same > > functionality. > > Well, ethtool has been the catch all for a lot of random things > for the longest time. The question is whether we want to extend > ETHTOOL_GET_TS_INFO or add a third API somewhere else. And if we > do - do we also duplicate the functionality of ETHTOOL_GET_TS_INFO > (i.e. getting capabilities)? > > My vote is that keeping it in ethtool is less bad than 3rd API. With SIOCSHWTSTAMP also implemented by CAN (and presumably also by wireless in the future), I do wonder whether ethtool is the right place for the netlink conversion. I wouldn't suggest duplicating ETHTOOL_GET_TS_INFO towards the netdev netlink family. > > The concept of an "active phc_index" would not explicitly exist in the > > UAPI. Thus I'm not sure what's with this TSINFO_SET being floated around. > > The only thing would exist is a configurable rx_filter and tx_type per > > hwtstamp provider (aka "{phc_index, qualifier}"). User space will have > > to learn to select the hwtstamp provider it wants to configure through > > netlink, and use for its class of traffic. > > "Active provider" is the one that has TX_ON, rx != FILTER_NONE, right? In the "implicit" definition of an "active hwtstamp provider", yes.
On Wed, Nov 22, 2023 at 08:54:59AM -0800, Jakub Kicinski wrote: > On Wed, 22 Nov 2023 16:36:18 +0200 Vladimir Oltean wrote: > > @Jakub, for your long-term "MAC timestamps for PTP, DMA for everything else". > > How do you see this? I guess we need some sort of priority function in > > the UAPI between hwtstamp providers. > > > > And even with that, I think the enums that we currently have for filters > > are not specific enough. The most we could expose is: > > > > MAC provider DMA provider > > > > hwtstamp_rx_filters HWTSTAMP_FILTER_PTP_V2_EVENT HWTSTAMP_FILTER_ALL > > tx_type HWTSTAMP_TX_ON HWTSTAMP_TX_ON > > > > but it isn't clear: for PTP, does the DMA provider give you an RX > > timestamp too? > > If we phrase it as "precise / approximate" rather than "MAC / DMA" - it > seems fairly intuitive to give the best timestamp available for a given > packet, no? I wouldn't be so sure. The alternative interpretation "for PTP, give me timestamps from both sources" also sounds reasonable for the distant future where that will be possible (with proper cmsg identification). But I don't see how to distinguish the two - the filters, expressed in these terms, would be the same. > > What about a TX timestamp? > > I was thinking - socket flag to make packets for a given socket request > precise timestamps. So the ptp4l source code would have to be modified to still work with the same precision as before? I'm not seeing this through.
On Wed, 22 Nov 2023 18:59:55 +0200 Vladimir Oltean wrote: > I wouldn't be so sure. The alternative interpretation "for PTP, give me > timestamps from both sources" also sounds reasonable for the distant > future where that will be possible (with proper cmsg identification). > But I don't see how to distinguish the two - the filters, expressed in > these terms, would be the same. We can add an attribute that explicitly says that the configuration is only requesting one stamp. But feels like jumping the gun at this stage, given we have no other option to express there. > So the ptp4l source code would have to be modified to still work with > the same precision as before? I'm not seeing this through. We can do the opposite and add a socket flag which says "DMA is okay".
On Wed, 22 Nov 2023 18:55:17 +0200 Vladimir Oltean wrote: > > Well, ethtool has been the catch all for a lot of random things > > for the longest time. The question is whether we want to extend > > ETHTOOL_GET_TS_INFO or add a third API somewhere else. And if we > > do - do we also duplicate the functionality of ETHTOOL_GET_TS_INFO > > (i.e. getting capabilities)? > > > > My vote is that keeping it in ethtool is less bad than 3rd API. > > With SIOCSHWTSTAMP also implemented by CAN (and presumably also by > wireless in the future), I do wonder whether ethtool is the right place > for the netlink conversion. ethtool currently provides the only way we have to configure ring length, ring count, RSS, UDP tunnels etc. It's a matter of taste, IMO ethtool is a bit of a lost cause already and keeping things together (ethtool already has TS_INFO) is cleaner than spreading them around. > I wouldn't suggest duplicating ETHTOOL_GET_TS_INFO towards the netdev > netlink family. FTR so far the netdev family is all about SW configuration. We should probably keep it that way, so it doesn't become ginormous. It's easy enough to create a new family, if needed.
On Wed, Nov 22, 2023 at 12:55 PM Jakub Kicinski <kuba@kernel.org> wrote: > > On Wed, 22 Nov 2023 18:59:55 +0200 Vladimir Oltean wrote: > > I wouldn't be so sure. The alternative interpretation "for PTP, give me > > timestamps from both sources" also sounds reasonable for the distant > > future where that will be possible (with proper cmsg identification). > > But I don't see how to distinguish the two - the filters, expressed in > > these terms, would be the same. > > We can add an attribute that explicitly says that the configuration > is only requesting one stamp. But feels like jumping the gun at this > stage, given we have no other option to express there. > > > So the ptp4l source code would have to be modified to still work with > > the same precision as before? I'm not seeing this through. > > We can do the opposite and add a socket flag which says "DMA is okay". There already is a disconnect between configuring hardware timestamp generation. Through the ioctl, which is a global admin-only interface. And requesting timestamps with SO_TIMESTAMPING. Today the user of ptp4l already has to know that the admin has configured the right RX and TX filters. That is no different if multiple filters can be installed? (PHY for PTP, DMA for everything else). If attribution becomes important, we could add another cmsg alongside the timestamp. On TX this already happens with IP_RECVERR/IPV6_RECVERR/PACKET_TX_TIMESTAMP. Maybe the sock_extended_err struct even still has a field that can be (ab)used for this purpose. Being able to pass multiple timestamps up to userspace eventually will be interesting. A large blocker is where to store these values in the sk_buff on the path between the driver and the socket (skb_ext?). At Google we already have this scenario, where the local TCP stack and userspace both want converted hardware timestamps -- but converted from raw to different timebases.
On Wed, 22 Nov 2023 10:01:42 -0800 Jakub Kicinski <kuba@kernel.org> wrote: > On Wed, 22 Nov 2023 18:55:17 +0200 Vladimir Oltean wrote: > > > Well, ethtool has been the catch all for a lot of random things > > > for the longest time. The question is whether we want to extend > > > ETHTOOL_GET_TS_INFO or add a third API somewhere else. And if we > > > do - do we also duplicate the functionality of ETHTOOL_GET_TS_INFO > > > (i.e. getting capabilities)? > > > > > > My vote is that keeping it in ethtool is less bad than 3rd API. > > > > With SIOCSHWTSTAMP also implemented by CAN (and presumably also by > > wireless in the future), I do wonder whether ethtool is the right place > > for the netlink conversion. > > ethtool currently provides the only way we have to configure ring > length, ring count, RSS, UDP tunnels etc. > > It's a matter of taste, IMO ethtool is a bit of a lost cause already > and keeping things together (ethtool already has TS_INFO) is cleaner > than spreading them around. > > > I wouldn't suggest duplicating ETHTOOL_GET_TS_INFO towards the netdev > > netlink family. > > FTR so far the netdev family is all about SW configuration. We should > probably keep it that way, so it doesn't become ginormous. It's easy > enough to create a new family, if needed. So, do we have a consensus? Vlad, do you agree on putting all under ethtool? ETHTOOL_GET_TS_INFO will be in charge of replacing the SIOCGHWSTAMP implementation. Need to add ETHTOOL_A_TSINFO_PHC_INDEX ETHTOOL_A_TSINFO_QUALIFIER to the request. ETHTOOL_GET_TS_INFO will list all the hwtstamp provider (aka "{phc_index, qualifier}") through the dumpit callback. I will add a filter to be able to list only the hwtstamp provider of one netdev. ETHTOOL_SET_TS_INFO will be in charge of replacing the SIOCSHWSTAMP implementation. Is that ok for you? Regards,
On Thu, 23 Nov 2023 16:00:56 +0100 Köry Maincent wrote: > > FTR so far the netdev family is all about SW configuration. We should > > probably keep it that way, so it doesn't become ginormous. It's easy > > enough to create a new family, if needed. > > So, do we have a consensus? Vlad, do you agree on putting all under ethtool? If not we can do a vote/poll? Maybe others don't find the configuration of timestamping as confusing as me.
On Thu, Nov 23, 2023 at 09:32:05AM -0800, Jakub Kicinski wrote: > On Thu, Nov 23, 2023 at 04:00:56PM +0100, Köry Maincent wrote: > > So, do we have a consensus? Vlad, do you agree on putting all under ethtool? > > > > ETHTOOL_GET_TS_INFO will be in charge of replacing the SIOCGHWSTAMP > > implementation. Need to add ETHTOOL_A_TSINFO_PHC_INDEX > > ETHTOOL_A_TSINFO_QUALIFIER to the request. > > > > ETHTOOL_GET_TS_INFO will list all the hwtstamp provider (aka "{phc_index, > > qualifier}") through the dumpit callback. I will add a filter to be able to > > list only the hwtstamp provider of one netdev. > > > > ETHTOOL_SET_TS_INFO will be in charge of replacing the SIOCSHWSTAMP > > implementation. > > If not we can do a vote/poll? Maybe others don't find the configuration > of timestamping as confusing as me. If you mean the ETHTOOL_MSG_TSINFO_GET netlink message (ETHTOOL_GET_TS_INFO is an ioctl), you're saying that you want to move the entire contents of SIOCGHWSTAMP there, by making the kernel call ndo_hwtstamp_get() in addition to the existing __ethtool_get_ts_info()? Yeah, I don't know, I don't have a real objection, I guess it's fine. What will be a bit of an "?!" moment for users is when ethtool gains support for the SIOCGHWSTAMP/SIOCSHWSTAMP netlink replacements, but not for the original ioctls. So hwstamp_ctl will be able to change timestamping configuration, but ethtool wouldn't - all on the same system. Unless ethtool gains an ioctl fallback for a ioctl that was never down its alley. But by all means, still hold a poll if you want to. I would vote for ethtool netlink, not because it's great, just because I don't have a better alternative to propose.
Hi Willem, On Wed, Nov 22, 2023 at 01:11:02PM -0500, Willem de Bruijn wrote: > There already is a disconnect between configuring hardware timestamp > generation. Through the ioctl, which is a global admin-only interface. > And requesting timestamps with SO_TIMESTAMPING. > > Today the user of ptp4l already has to know that the admin has > configured the right RX and TX filters. That is no different if > multiple filters can be installed? (PHY for PTP, DMA for everything > else). Are you saying that ptp4l doesn't configure the RX and TX filters by itself, just the admin had to do that? Because it does. https://github.com/richardcochran/linuxptp/blob/master/sk.c#L59 I'm not seeing the disconnect. SO_TIMESTAMPING is for the socket, SIOCSHWTSTAMP is for the configuration at the device level. It _is_ different if multiple filters can be installed, because either we let things be (and ptp4l issues the same ioctl which affects the default hwtstamp provider, which may or may not coincide with what we intend), or we teach ptp4l to deal with the multitude of providers that a port may have.
On Fri, 24 Nov 2023 17:43:43 +0200 Vladimir Oltean <vladimir.oltean@nxp.com> wrote: > On Thu, Nov 23, 2023 at 09:32:05AM -0800, Jakub Kicinski wrote: > > On Thu, Nov 23, 2023 at 04:00:56PM +0100, Köry Maincent wrote: > > > So, do we have a consensus? Vlad, do you agree on putting all under > > > ethtool? > > > > > > ETHTOOL_GET_TS_INFO will be in charge of replacing the SIOCGHWSTAMP > > > implementation. Need to add ETHTOOL_A_TSINFO_PHC_INDEX > > > ETHTOOL_A_TSINFO_QUALIFIER to the request. > > > > > > ETHTOOL_GET_TS_INFO will list all the hwtstamp provider (aka "{phc_index, > > > qualifier}") through the dumpit callback. I will add a filter to be able > > > to list only the hwtstamp provider of one netdev. > > > > > > ETHTOOL_SET_TS_INFO will be in charge of replacing the SIOCSHWSTAMP > > > implementation. > > > > If not we can do a vote/poll? Maybe others don't find the configuration > > of timestamping as confusing as me. > > If you mean the ETHTOOL_MSG_TSINFO_GET netlink message (ETHTOOL_GET_TS_INFO > is an ioctl), you're saying that you want to move the entire contents of > SIOCGHWSTAMP there, by making the kernel call ndo_hwtstamp_get() in > addition to the existing __ethtool_get_ts_info()? Yes. > Yeah, I don't know, I don't have a real objection, I guess it's fine. > > What will be a bit of an "?!" moment for users is when ethtool gains > support for the SIOCGHWSTAMP/SIOCSHWSTAMP netlink replacements, but not > for the original ioctls. So hwstamp_ctl will be able to change timestamping > configuration, but ethtool wouldn't - all on the same system. Unless > ethtool gains an ioctl fallback for a ioctl that was never down its alley. Yes indeed. Would it break things if both ioctls and netlink can get and set the hwtstamps configuration? It is only configuration. Both happen under rtnl_lock it should be alright. The question is which hwtstamp provider will the original ioctls be able to change? Maybe the default one (MAC with phy whitelist) and only this one. > But by all means, still hold a poll if you want to. I would vote for > ethtool netlink, not because it's great, just because I don't have a > better alternative to propose. If you agree on that choice, let's go. Jakub and your are the most proactive reviewers in this patch series. Willem you are the timestamping maintainer do you also agree on this? If anyone have another proposition let them speak now, or forever remain silent! ;) Regards,
On Fri, Nov 24, 2023 at 12:28 PM Vladimir Oltean <vladimir.oltean@nxp.com> wrote: > > Hi Willem, > > On Wed, Nov 22, 2023 at 01:11:02PM -0500, Willem de Bruijn wrote: > > There already is a disconnect between configuring hardware timestamp > > generation. Through the ioctl, which is a global admin-only interface. > > And requesting timestamps with SO_TIMESTAMPING. > > > > Today the user of ptp4l already has to know that the admin has > > configured the right RX and TX filters. That is no different if > > multiple filters can be installed? (PHY for PTP, DMA for everything > > else). > > Are you saying that ptp4l doesn't configure the RX and TX filters by > itself, just the admin had to do that? Because it does. > https://github.com/richardcochran/linuxptp/blob/master/sk.c#L59 > > I'm not seeing the disconnect. SO_TIMESTAMPING is for the socket, > SIOCSHWTSTAMP is for the configuration at the device level. > > It _is_ different if multiple filters can be installed, because either > we let things be (and ptp4l issues the same ioctl which affects the > default hwtstamp provider, which may or may not coincide with what we > intend), or we teach ptp4l to deal with the multitude of providers that > a port may have. I see. By disconnect, I meant that the socket option is unprivileged and can be set by many processes, while the ioctl is a global privileged setting, so must be under control of a single admin. But I did not know that ptp4l can take on both those roles for PTP. Perhaps multiple SIOCSHWTSTAMP rules can coexist, up to one per level: HWTSTAMP_FILTER_PTP_V2_EVENT, level=PHY HWTSTAMP_FILTER_ALL, level, level=DMA Then ptp4l can manage all levels except the DMA level. And DMA timestamps can be configured independently by another admin. If only one timestamp can be communicated to the host, the earliest match must takes precedence. Jakub pointed out how one device handles this by having a separate queue for PHY timestamped packets. This does not address the issue that packets with different precision skb_shinfo(skb)->hwtstamps->hwtstamp may now exist in the system. All packets reaching ptp4l sockets must have a high resolution source, but there is no explicit annotation to ensure or check this. This is fully based on trusting the HWSTAMP_FILTER. Expanding the skb infra and cmsg might be follow-on work.
On Fri, Nov 24, 2023 at 06:34:31PM +0100, Köry Maincent wrote: > Would it break things if both ioctls and netlink can get and set the > hwtstamps configuration? Uhm, obviously? It would break things if ioctl and netlink were _not_ freely interchangeable, and you couldn't see in a ioctl GET what got set through a netlink SET. > It is only configuration. Both happen under rtnl_lock it should be > alright. Yeah, but you always need to keep the API interchangeability in mind during the implementation. > The question is which hwtstamp provider will the original ioctls be able to > change? Maybe the default one (MAC with phy whitelist) and only this one. TL;DR: yeah. Remember one single rule and go from there: new development should not change established setups. So SIOCSHWSTAMPs should continue to behave "as before". This is also the exact reason why I asked for the phy whitelist. The introduction of CONFIG_NETWORK_PHY_TIMESTAMPING introduced exactly that: a breaking change in the mode in which deployed setups operate. > > But by all means, still hold a poll if you want to. I would vote for > > ethtool netlink, not because it's great, just because I don't have a > > better alternative to propose. > > If you agree on that choice, let's go. Jakub and your are the most proactive > reviewers in this patch series. Willem you are the timestamping maintainer do > you also agree on this? > If anyone have another proposition let them speak now, or forever remain > silent! ;) Hmm, proactive means doing stuff in anticipation of being requested to do it. I'd use the work "active" at most...
On Fri, Nov 24, 2023 at 12:34 PM Köry Maincent <kory.maincent@bootlin.com> wrote: > > On Fri, 24 Nov 2023 17:43:43 +0200 > Vladimir Oltean <vladimir.oltean@nxp.com> wrote: > > > On Thu, Nov 23, 2023 at 09:32:05AM -0800, Jakub Kicinski wrote: > > > On Thu, Nov 23, 2023 at 04:00:56PM +0100, Köry Maincent wrote: > > > > So, do we have a consensus? Vlad, do you agree on putting all under > > > > ethtool? > > > > > > > > ETHTOOL_GET_TS_INFO will be in charge of replacing the SIOCGHWSTAMP > > > > implementation. Need to add ETHTOOL_A_TSINFO_PHC_INDEX > > > > ETHTOOL_A_TSINFO_QUALIFIER to the request. > > > > > > > > ETHTOOL_GET_TS_INFO will list all the hwtstamp provider (aka "{phc_index, > > > > qualifier}") through the dumpit callback. I will add a filter to be able > > > > to list only the hwtstamp provider of one netdev. > > > > > > > > ETHTOOL_SET_TS_INFO will be in charge of replacing the SIOCSHWSTAMP > > > > implementation. > > > > > > If not we can do a vote/poll? Maybe others don't find the configuration > > > of timestamping as confusing as me. > > > > If you mean the ETHTOOL_MSG_TSINFO_GET netlink message (ETHTOOL_GET_TS_INFO > > is an ioctl), you're saying that you want to move the entire contents of > > SIOCGHWSTAMP there, by making the kernel call ndo_hwtstamp_get() in > > addition to the existing __ethtool_get_ts_info()? > > Yes. > > > Yeah, I don't know, I don't have a real objection, I guess it's fine. > > > > What will be a bit of an "?!" moment for users is when ethtool gains > > support for the SIOCGHWSTAMP/SIOCSHWSTAMP netlink replacements, but not > > for the original ioctls. So hwstamp_ctl will be able to change timestamping > > configuration, but ethtool wouldn't - all on the same system. Unless > > ethtool gains an ioctl fallback for a ioctl that was never down its alley. > > Yes indeed. Would it break things if both ioctls and netlink can get and set > the hwtstamps configuration? It is only configuration. Both happen under > rtnl_lock it should be alright. > > The question is which hwtstamp provider will the original ioctls be able to > change? Maybe the default one (MAC with phy whitelist) and only this one. > > > But by all means, still hold a poll if you want to. I would vote for > > ethtool netlink, not because it's great, just because I don't have a > > better alternative to propose. > > If you agree on that choice, let's go. Jakub and your are the most proactive > reviewers in this patch series. Willem you are the timestamping maintainer do > you also agree on this? I don't have a strong opinion. Ethtool netlink SGTM. For new network configuration we are moving away from ioctl towards netlink in general. Ethtool itself made this move, where the old ioctl way of things continues to work, but will no longer be extended. Since one of the APIs we use already uses ethtool, converting the other two there makes sense to me. I'm not familiar enough with configuring CAN or wireless to know whether it would pose a problem for these mentioned cases. > If anyone have another proposition let them speak now, or forever remain > silent! ;)
On Fri, Nov 24, 2023 at 02:45:46PM -0500, Willem de Bruijn wrote:
> Expanding the skb infra and cmsg might be follow-on work.
Yes, and would I suggest limiting the scope of the present series just
to allow changing the global, device-level time stamping layer
administratively.
Trying to support multiple time stamps from different layers is a much
larger development project, and it will require new kernel/user
interfaces.
(Of course it would be grand if the series is forward looking to the
day when time stamp reporting expands beyond the current hw/sw cmsg.)
Thanks,
Richard
On Mon, 20 Nov 2023 13:45:51 -0800 Jakub Kicinski <kuba@kernel.org> wrote: > On Mon, 20 Nov 2023 21:58:58 +0200 Vladimir Oltean wrote: > > I'm still waiting for you to fully clarify the "per socket vs global" > > aspect, but independently of that, at least I understand why this is a > > counter-argument to my proposal. I need to tune it a bit (ASSUMING that > > we want DMA timestamps to "look like" hwtimestamps, and not like their > > own thing, to user space), because the PHC index would no longer fully > > identify a hwtstamp provider, so we need something more. > > > > I imagine both ETHTOOL_MSG_TSINFO_GET and ETHTOOL_MSG_TSINFO_SET to > > support a new (nest) nlattr called ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER. > > > > This would contain (u32) ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER_PHC_INDEX > > and (u32) ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER_QUALIFIER. It could be > > extensible in the future, but this is the baseline and forms the key. > > > > The latter takes values from an: > > > > enum ethtool_hwstamp_provider_qualifier { > > ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_MAC, > > ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_PHY, > > ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_DMA, > > }; > > Sounds reasonable. Having more attributes than just PHC index works. > Given the lack of distinction between MAC and PHY for integrated NICs > I'd lean towards ditching the "layers" completely and exposing > an "approximate" vs "precise" boolean. Approximate being the DMA point > for NICs, but more generically a point that is separated from the wire > by buffering or other variable length delay. Precise == IEEE 1588 > quality. Hello Jakub, just wondering. I can add this hwtstamp provider qualifier in the next series version but it won't be used as it is set and used at the driver level and no driver is using it for now. It would not be accepted if I use something that is not used, right? Do you still think I should add this in v8? Regards,
On Wed, Nov 29, 2023 at 09:09:59PM +0100, Köry Maincent wrote: > On Mon, 20 Nov 2023 13:45:51 -0800 > Jakub Kicinski <kuba@kernel.org> wrote: > > > On Mon, 20 Nov 2023 21:58:58 +0200 Vladimir Oltean wrote: > > > I'm still waiting for you to fully clarify the "per socket vs global" > > > aspect, but independently of that, at least I understand why this is a > > > counter-argument to my proposal. I need to tune it a bit (ASSUMING that > > > we want DMA timestamps to "look like" hwtimestamps, and not like their > > > own thing, to user space), because the PHC index would no longer fully > > > identify a hwtstamp provider, so we need something more. > > > > > > I imagine both ETHTOOL_MSG_TSINFO_GET and ETHTOOL_MSG_TSINFO_SET to > > > support a new (nest) nlattr called ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER. > > > > > > This would contain (u32) ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER_PHC_INDEX > > > and (u32) ETHTOOL_A_TSINFO_HWSTAMP_PROVIDER_QUALIFIER. It could be > > > extensible in the future, but this is the baseline and forms the key. > > > > > > The latter takes values from an: > > > > > > enum ethtool_hwstamp_provider_qualifier { > > > ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_MAC, > > > ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_PHY, > > > ETHTOOL_HWSTAMP_PROVIDER_QUALIFIER_DMA, > > > }; > > > > Sounds reasonable. Having more attributes than just PHC index works. > > Given the lack of distinction between MAC and PHY for integrated NICs > > I'd lean towards ditching the "layers" completely and exposing > > an "approximate" vs "precise" boolean. Approximate being the DMA point > > for NICs, but more generically a point that is separated from the wire > > by buffering or other variable length delay. Precise == IEEE 1588 > > quality. > > Hello Jakub, just wondering. > I can add this hwtstamp provider qualifier in the next series version but it > won't be used as it is set and used at the driver level and no driver is using > it for now. It would not be accepted if I use something that is not used, right? > Do you still think I should add this in v8? Not sure why you say "not used", though. Are you not planning to expose the qualifier as an attribute to the listing of hwtstamp providers offered to user space by ETHTOOL_MSG_TSINFO_GET? Personally, I worry that if the qualifier gets added later (not now) to the UAPI, we will end up having user space software (written now) that iterates through the provider listing thinking that there may only ever be one provider offered by one PHC, and will stop at the first such provider found, whichever that may be. With the added qualifier, there's a higher chance that user space searches will be for a {phc, qualifier} pair (even if there will only be 1 possible qualifier type), and the future addition of a new hwtstamp provider will keep existing software working in the same way as before, i.e. user space won't select the DMA provider by mistake, by ignoring the qualifier attribute altogether. Generally I'm against adding things upfront that can only be in a certain way, but in this case I believe that it is necessary in order for the future extensions that were discussed to be possible. The qualifier is part of the user space search key and thus pretty important. My 2 cents, Jakub can absolutely disagree.
On Wed, 29 Nov 2023 22:37:00 +0200 Vladimir Oltean <vladimir.oltean@nxp.com> wrote: > On Wed, Nov 29, 2023 at 09:09:59PM +0100, Köry Maincent wrote: > > On Mon, 20 Nov 2023 13:45:51 -0800 > > Jakub Kicinski <kuba@kernel.org> wrote: > > > Sounds reasonable. Having more attributes than just PHC index works. > > > Given the lack of distinction between MAC and PHY for integrated NICs > > > I'd lean towards ditching the "layers" completely and exposing > > > an "approximate" vs "precise" boolean. Approximate being the DMA point > > > for NICs, but more generically a point that is separated from the wire > > > by buffering or other variable length delay. Precise == IEEE 1588 > > > quality. > > > > Hello Jakub, just wondering. > > I can add this hwtstamp provider qualifier in the next series version but it > > won't be used as it is set and used at the driver level and no driver is > > using it for now. It would not be accepted if I use something that is not > > used, right? Do you still think I should add this in v8? > > Not sure why you say "not used", though. Are you not planning to expose > the qualifier as an attribute to the listing of hwtstamp providers > offered to user space by ETHTOOL_MSG_TSINFO_GET? Yes I will, I was just saying that all the PHC would be set as precise for now. Approximate timestamp quality won't be used because IIUC there are no NIC driver supporting it yet. > Personally, I worry that if the qualifier gets added later (not now) to > the UAPI, we will end up having user space software (written now) that > iterates through the provider listing thinking that there may only ever > be one provider offered by one PHC, and will stop at the first such > provider found, whichever that may be. > > With the added qualifier, there's a higher chance that user space > searches will be for a {phc, qualifier} pair (even if there will only be > 1 possible qualifier type), and the future addition of a new hwtstamp > provider will keep existing software working in the same way as before, > i.e. user space won't select the DMA provider by mistake, by ignoring > the qualifier attribute altogether. > > Generally I'm against adding things upfront that can only be in a certain > way, but in this case I believe that it is necessary in order for the > future extensions that were discussed to be possible. The qualifier is > part of the user space search key and thus pretty important. > > My 2 cents, Jakub can absolutely disagree. Alright, this seems relevant. Regards,
On Wed, 29 Nov 2023 23:00:34 +0100 Köry Maincent wrote: > > Not sure why you say "not used", though. Are you not planning to expose > > the qualifier as an attribute to the listing of hwtstamp providers > > offered to user space by ETHTOOL_MSG_TSINFO_GET? > > Yes I will, I was just saying that all the PHC would be set as precise for now. > Approximate timestamp quality won't be used because IIUC there are no NIC driver > supporting it yet. Agreed that we should add the attr from the start. Maybe we can ask/work with Rahul <rrameshbabu@nvidia.com> to implement the right thing in mlx5? Failing that we can mark mlx5 as imprecise, until its sorted out. So that we have both types in the tree.
On Wed, 29 Nov, 2023 15:56:13 -0800 Jakub Kicinski <kuba@kernel.org> wrote: > On Wed, 29 Nov 2023 23:00:34 +0100 Köry Maincent wrote: >> > Not sure why you say "not used", though. Are you not planning to expose >> > the qualifier as an attribute to the listing of hwtstamp providers >> > offered to user space by ETHTOOL_MSG_TSINFO_GET? >> >> Yes I will, I was just saying that all the PHC would be set as precise for now. >> Approximate timestamp quality won't be used because IIUC there are no NIC driver >> supporting it yet. > > Agreed that we should add the attr from the start. > > Maybe we can ask/work with Rahul <rrameshbabu@nvidia.com> > to implement the right thing in mlx5? Thanks for looping me in. We were already looking at this patch series out of interest. I saw your suggestion to rephrase "MAC / DMA" as "precise / approximate", which we really like for mlx5 devices because our "approximate" timestamping logic is not exactly a "MAC" timestamp but its not a port timestamp that has the greater precision we use. I have a task already for implementing support for this ethtool attribute. If folks here are open to it, I can add mlx5 support for both modes in this patch series for the next revision that will entail the discussed changes. > > Failing that we can mark mlx5 as imprecise, until its sorted out. > So that we have both types in the tree. -- Thanks, Rahul Rameshbabu
diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst index b8d00676ed82..530c1775e5f4 100644 --- a/Documentation/networking/ethtool-netlink.rst +++ b/Documentation/networking/ethtool-netlink.rst @@ -227,6 +227,7 @@ Userspace to kernel: ``ETHTOOL_MSG_MM_SET`` set MAC merge layer parameters ``ETHTOOL_MSG_TS_GET`` get current timestamping ``ETHTOOL_MSG_TS_LIST_GET`` list available timestampings + ``ETHTOOL_MSG_TS_SET`` set current timestamping ===================================== ================================= Kernel to userspace: @@ -2038,6 +2039,21 @@ Kernel response contents: This command lists all the possible timestamp layer available. +TS_SET +====== + +Modify the selected timestamping. + +Request contents: + + ======================= ====== =================== + ``ETHTOOL_A_TS_HEADER`` nested reply header + ``ETHTOOL_A_TS_LAYER`` u32 timestamping + ======================= ====== =================== + +This command set the timestamping with one that should be listed by the +TSLIST_GET command. + Request translation =================== @@ -2146,4 +2162,5 @@ are netlink only. n/a ``ETHTOOL_MSG_MM_SET`` n/a ``ETHTOOL_MSG_TS_GET`` n/a ``ETHTOOL_MSG_TS_LIST_GET`` + n/a ``ETHTOOL_MSG_TS_SET`` =================================== ===================================== diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index 62b885d44d06..df6c4fcc62c1 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -59,6 +59,7 @@ enum { ETHTOOL_MSG_MM_SET, ETHTOOL_MSG_TS_GET, ETHTOOL_MSG_TS_LIST_GET, + ETHTOOL_MSG_TS_SET, /* add new constants above here */ __ETHTOOL_MSG_USER_CNT, diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c index 842c9db1531f..8322bf71f80d 100644 --- a/net/ethtool/netlink.c +++ b/net/ethtool/netlink.c @@ -308,6 +308,7 @@ ethnl_default_requests[__ETHTOOL_MSG_USER_CNT] = { [ETHTOOL_MSG_MM_SET] = ðnl_mm_request_ops, [ETHTOOL_MSG_TS_GET] = ðnl_ts_request_ops, [ETHTOOL_MSG_TS_LIST_GET] = ðnl_ts_list_request_ops, + [ETHTOOL_MSG_TS_SET] = ðnl_ts_request_ops, }; static struct ethnl_dump_ctx *ethnl_dump_context(struct netlink_callback *cb) @@ -1148,6 +1149,13 @@ static const struct genl_ops ethtool_genl_ops[] = { .policy = ethnl_ts_get_policy, .maxattr = ARRAY_SIZE(ethnl_ts_get_policy) - 1, }, + { + .cmd = ETHTOOL_MSG_TS_SET, + .flags = GENL_UNS_ADMIN_PERM, + .doit = ethnl_default_set_doit, + .policy = ethnl_ts_set_policy, + .maxattr = ARRAY_SIZE(ethnl_ts_set_policy) - 1, + }, }; static const struct genl_multicast_group ethtool_nl_mcgrps[] = { diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h index ea8c312db3af..8fedf234b824 100644 --- a/net/ethtool/netlink.h +++ b/net/ethtool/netlink.h @@ -444,6 +444,7 @@ extern const struct nla_policy ethnl_plca_get_status_policy[ETHTOOL_A_PLCA_HEADE extern const struct nla_policy ethnl_mm_get_policy[ETHTOOL_A_MM_HEADER + 1]; extern const struct nla_policy ethnl_mm_set_policy[ETHTOOL_A_MM_MAX + 1]; extern const struct nla_policy ethnl_ts_get_policy[ETHTOOL_A_TS_HEADER + 1]; +extern const struct nla_policy ethnl_ts_set_policy[ETHTOOL_A_TS_MAX + 1]; int ethnl_set_features(struct sk_buff *skb, struct genl_info *info); int ethnl_act_cable_test(struct sk_buff *skb, struct genl_info *info); diff --git a/net/ethtool/ts.c b/net/ethtool/ts.c index bd219512b8de..357265e74e08 100644 --- a/net/ethtool/ts.c +++ b/net/ethtool/ts.c @@ -59,6 +59,102 @@ static int ts_fill_reply(struct sk_buff *skb, return nla_put_u32(skb, ETHTOOL_A_TS_LAYER, data->ts_layer); } +/* TS_SET */ +const struct nla_policy ethnl_ts_set_policy[] = { + [ETHTOOL_A_TS_HEADER] = NLA_POLICY_NESTED(ethnl_header_policy), + [ETHTOOL_A_TS_LAYER] = NLA_POLICY_RANGE(NLA_U32, 0, + __TIMESTAMPING_COUNT - 1) +}; + +static int ethnl_set_ts_validate(struct ethnl_req_info *req_info, + struct genl_info *info) +{ + struct nlattr **tb = info->attrs; + const struct net_device_ops *ops = req_info->dev->netdev_ops; + + if (!ops->ndo_hwtstamp_set) + return -EOPNOTSUPP; + + if (!tb[ETHTOOL_A_TS_LAYER]) + return 0; + + return 1; +} + +static int ethnl_set_ts(struct ethnl_req_info *req_info, struct genl_info *info) +{ + struct net_device *dev = req_info->dev; + const struct ethtool_ops *ops = dev->ethtool_ops; + struct kernel_hwtstamp_config config = {0}; + struct nlattr **tb = info->attrs; + enum timestamping_layer ts_layer; + bool mod = false; + int ret; + + ts_layer = dev->ts_layer; + ethnl_update_u32(&ts_layer, tb[ETHTOOL_A_TS_LAYER], &mod); + + if (!mod) + return 0; + + if (ts_layer == SOFTWARE_TIMESTAMPING) { + struct ethtool_ts_info ts_info = {0}; + + if (!ops->get_ts_info) { + NL_SET_ERR_MSG_ATTR(info->extack, + tb[ETHTOOL_A_TS_LAYER], + "this net device cannot support timestamping"); + return -EINVAL; + } + + ops->get_ts_info(dev, &ts_info); + if ((ts_info.so_timestamping & + SOF_TIMESTAMPING_SOFTWARE_MASK) != + SOF_TIMESTAMPING_SOFTWARE_MASK) { + NL_SET_ERR_MSG_ATTR(info->extack, + tb[ETHTOOL_A_TS_LAYER], + "this net device cannot support software timestamping"); + return -EINVAL; + } + } else if (ts_layer == MAC_TIMESTAMPING) { + struct ethtool_ts_info ts_info = {0}; + + if (!ops->get_ts_info) { + NL_SET_ERR_MSG_ATTR(info->extack, + tb[ETHTOOL_A_TS_LAYER], + "this net device cannot support timestamping"); + return -EINVAL; + } + + ops->get_ts_info(dev, &ts_info); + if ((ts_info.so_timestamping & + SOF_TIMESTAMPING_HARDWARE_MASK) != + SOF_TIMESTAMPING_HARDWARE_MASK) { + NL_SET_ERR_MSG_ATTR(info->extack, + tb[ETHTOOL_A_TS_LAYER], + "this net device cannot support hardware timestamping"); + return -EINVAL; + } + } else if (ts_layer == PHY_TIMESTAMPING && !phy_has_tsinfo(dev->phydev)) { + NL_SET_ERR_MSG_ATTR(info->extack, tb[ETHTOOL_A_TS_LAYER], + "this phy device cannot support timestamping"); + return -EINVAL; + } + + /* Disable time stamping in the current layer. */ + if (netif_device_present(dev) && + (dev->ts_layer == PHY_TIMESTAMPING || + dev->ts_layer == MAC_TIMESTAMPING)) { + ret = dev_set_hwtstamp_phylib(dev, &config, info->extack); + if (ret < 0) + return ret; + } + + dev->ts_layer = ts_layer; + + return 1; +} + const struct ethnl_request_ops ethnl_ts_request_ops = { .request_cmd = ETHTOOL_MSG_TS_GET, .reply_cmd = ETHTOOL_MSG_TS_GET_REPLY, @@ -69,6 +165,9 @@ const struct ethnl_request_ops ethnl_ts_request_ops = { .prepare_data = ts_prepare_data, .reply_size = ts_reply_size, .fill_reply = ts_fill_reply, + + .set_validate = ethnl_set_ts_validate, + .set = ethnl_set_ts, }; /* TS_LIST_GET */
Now that the current timestamp is saved in a variable lets add the ETHTOOL_MSG_TS_SET ethtool netlink socket to make it selectable. Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> --- Changes in v2: - Move selected_timestamping_layer introduction in this patch. - Replace strmcmp by sysfs_streq. - Use the PHY timestamp only if available. Changes in v3: - Add a devicetree binding to select the preferred timestamp - Replace the way to select timestamp through ethtool instead of sysfs You can test it with the ethtool source on branch feature_ptp of: https://github.com/kmaincent/ethtool Changes in v4: - Change the API to select MAC default time stamping instead of the PHY. - Add a whitelist to no break the old timestamp PHY default preferences for current PHY. - Replace the ethtool ioctl by netlink. - Add a netdev notifier to allow network device to create trap on PTP packets. Not tested as it need to change the lan966x driver that implement packet traps. I will do after the hwtstamp management change to NDOs. Changes in v5: - Remove the netdev notifier added in v4. - Extract the default timestamp API change in another patch. Changes in v6: - Update the error message. - Put ndo_hwtstamp_set check first as it is most likely what most drivers currently do not support. - Follow timestamping layer naming update. - Update the timestamp layer check. - Update timestamp set between MAC and software. Changes in v7: - Fix commit title typo. --- Documentation/networking/ethtool-netlink.rst | 17 +++++ include/uapi/linux/ethtool_netlink.h | 1 + net/ethtool/netlink.c | 8 +++ net/ethtool/netlink.h | 1 + net/ethtool/ts.c | 99 ++++++++++++++++++++++++++++ 5 files changed, 126 insertions(+)