Message ID | 20231009155138.86458-9-kory.maincent@bootlin.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: Make timestamping selectable | expand |
On 10/9/23 08:51, Köry Maincent wrote: > From: Kory Maincent <kory.maincent@bootlin.com> > > Time stamping on network packets may happen either in the MAC or in > the PHY, but not both. In preparation for making the choice > selectable, expose both the current and available layers via ethtool. > > In accordance with the kernel implementation as it stands, the current > layer will always read as "phy" when a PHY time stamping device is > present. Future patches will allow changing the current layer > administratively. > > Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> > > --- [snip] > +/* > + * Hardware layer of the TIMESTAMPING provider > + * New description layer should have the NETDEV_TIMESTAMPING or > + * PHYLIB_TIMESTAMPING bit set to know which API to use for timestamping. If we are talking about hardware layers, then we shall use either PHY_TIMESTAMPING or MAC_TIMESTAMPING. PHYLIB is the sub-subsystem to deal with Ethernet PHYs, and netdev is the object through which we represent network devices, so they are not even quite describing similar things. If you go with the {PHY,MAC}_TIMESTAMPING suggestion, then I could see how we could somewhat easily add PCS_TIMESTAMPING for instance. > + */ > +enum { > + NO_TIMESTAMPING = 0, > + NETDEV_TIMESTAMPING = (1 << 0), > + PHYLIB_TIMESTAMPING = (1 << 1), > + SOFTWARE_TIMESTAMPING = (1 << 2) | (1 << 0), Why do we have to set NETDEV_TIMESTAMPING here, or is this a round-about way of enumerating 0, 1, 2 and 3?
On Mon, 9 Oct 2023 14:20:02 -0700 Florian Fainelli <florian.fainelli@broadcom.com> wrote: Hello Florian, Thanks for your review! > > +/* > > + * Hardware layer of the TIMESTAMPING provider > > + * New description layer should have the NETDEV_TIMESTAMPING or > > + * PHYLIB_TIMESTAMPING bit set to know which API to use for timestamping. > > If we are talking about hardware layers, then we shall use either > PHY_TIMESTAMPING or MAC_TIMESTAMPING. PHYLIB is the sub-subsystem to > deal with Ethernet PHYs, and netdev is the object through which we > represent network devices, so they are not even quite describing similar > things. If you go with the {PHY,MAC}_TIMESTAMPING suggestion, then I > could see how we could somewhat easily add PCS_TIMESTAMPING for instance. I am indeed talking about hardware layers but I updated the name to use NETDEV and PHYLIB timestamping for a reason. It is indeed only PHY or MAC timestamping for now but it may be expanded in the future to theoretically to 7 layers of timestamps possible. Also there may be several possible timestamp within a MAC device precision vs volume. See the thread of my last version that talk about it: https://lore.kernel.org/netdev/20230511203646.ihljeknxni77uu5j@skbuf/ All these possibles timestamps go through exclusively the netdev API or the phylib API. Even the software timestamping is done in the netdev driver, therefore it goes through the netdev API and then should have the NETDEV_TIMESTAMPING bit set. > > + */ > > +enum { > > + NO_TIMESTAMPING = 0, > > + NETDEV_TIMESTAMPING = (1 << 0), > > + PHYLIB_TIMESTAMPING = (1 << 1), > > + SOFTWARE_TIMESTAMPING = (1 << 2) | (1 << 0), > > Why do we have to set NETDEV_TIMESTAMPING here, or is this a round-about > way of enumerating 0, 1, 2 and 3? I answered you above the software timestamping should have the NETDEV_TIMESTAMPING bit set as it is done from the net device driver. What I was thinking is that all the new timestamping should have NETDEV_TIMESTAMPING or PHYLIB_TIMESTAMPING set to know which API to pass through. Like we could add these in the future: MAC_DMA_TIMESTAMPING = (2 << 2) | (1 >> 0), MAC_PRECISION_TIMESTAMPING = (3 << 2) | (1 >> 0), ... PHY_SFP_TIMESTAMPING = (2 << 2) | (1 << 1), ... Or maybe do you prefer to use defines like this: # define NETDEV_TIMESTAMPING (1 << 0) # define PHYLIB_TIMESTAMPING (1 << 1) enum { NO_TIMESTAMPING = 0, MAC_TIMESTAMPING = NETDEV_TIMESTAMPING, PHY_TIMESTAMPING = PHYLIB_TIMESTAMPING, SOFTWARE_TIMESTAMPING = (1 << 2) | NETDEV_TIMESTAMPING, ... MAC_DMA_TIMESTAMPING = (2 << 2) | NETDEV_TIMESTAMPING, MAC_PRECISION_TIMESTAMPING = (3 << 2) | NETDEV_TIMESTAMPING, or other idea? Regards,
On Tue, 10 Oct 2023 10:23:43 +0200 Köry Maincent wrote: > > > +/* > > > + * Hardware layer of the TIMESTAMPING provider > > > + * New description layer should have the NETDEV_TIMESTAMPING or > > > + * PHYLIB_TIMESTAMPING bit set to know which API to use for timestamping. > > > > If we are talking about hardware layers, then we shall use either > > PHY_TIMESTAMPING or MAC_TIMESTAMPING. PHYLIB is the sub-subsystem to > > deal with Ethernet PHYs, and netdev is the object through which we > > represent network devices, so they are not even quite describing similar > > things. If you go with the {PHY,MAC}_TIMESTAMPING suggestion, then I > > could see how we could somewhat easily add PCS_TIMESTAMPING for instance. > > I am indeed talking about hardware layers but I updated the name to use NETDEV > and PHYLIB timestamping for a reason. It is indeed only PHY or MAC timestamping > for now but it may be expanded in the future to theoretically to 7 layers of > timestamps possible. Also there may be several possible timestamp within a MAC > device precision vs volume. > See the thread of my last version that talk about it: > https://lore.kernel.org/netdev/20230511203646.ihljeknxni77uu5j@skbuf/ > > All these possibles timestamps go through exclusively the netdev API or the > phylib API. Even the software timestamping is done in the netdev driver, > therefore it goes through the netdev API and then should have the > NETDEV_TIMESTAMPING bit set. Netdev vs phylib is an implementation detail of Linux. I'm also surprised that you changed this. > > > + */ > > > +enum { > > > + NO_TIMESTAMPING = 0, > > > + NETDEV_TIMESTAMPING = (1 << 0), > > > + PHYLIB_TIMESTAMPING = (1 << 1), > > > + SOFTWARE_TIMESTAMPING = (1 << 2) | (1 << 0), > > > > Why do we have to set NETDEV_TIMESTAMPING here, or is this a round-about > > way of enumerating 0, 1, 2 and 3? > > I answered you above the software timestamping should have the > NETDEV_TIMESTAMPING bit set as it is done from the net device driver. > > What I was thinking is that all the new timestamping should have > NETDEV_TIMESTAMPING or PHYLIB_TIMESTAMPING set to know which API to pass > through. > Like we could add these in the future: > MAC_DMA_TIMESTAMPING = (2 << 2) | (1 >> 0), > MAC_PRECISION_TIMESTAMPING = (3 << 2) | (1 >> 0), > ... > PHY_SFP_TIMESTAMPING = (2 << 2) | (1 << 1), > ... What is "PRECISION"? DMA is a separate block like MAC and PHY.
> > All these possibles timestamps go through exclusively the netdev API or the > > phylib API. Even the software timestamping is done in the netdev driver, > > therefore it goes through the netdev API and then should have the > > NETDEV_TIMESTAMPING bit set. > > Netdev vs phylib is an implementation detail of Linux. > I'm also surprised that you changed this. > > > > > + */ > > > > +enum { > > > > + NO_TIMESTAMPING = 0, > > > > + NETDEV_TIMESTAMPING = (1 << 0), > > > > + PHYLIB_TIMESTAMPING = (1 << 1), > > > > + SOFTWARE_TIMESTAMPING = (1 << 2) | (1 << 0), Just emphasising Jakubs point here. phylib is an implementation detail, in that the MAC driver might be using firmware to drive its PHY, and that firmware can do a timestamp in the PHY. The API being defined here should be independent of the implementation details. So it probably should be MAC_TIMESTAMPING and PHY_TIMESTAMPING, and leave it to the driver to decide if its PHYLIB doing the actual work, or firmware. Andrew
On Fri, Oct 13, 2023 at 09:00:20AM -0700, Jakub Kicinski wrote: > On Tue, 10 Oct 2023 10:23:43 +0200 Köry Maincent wrote: > > > Why do we have to set NETDEV_TIMESTAMPING here, or is this a round-about > > > way of enumerating 0, 1, 2 and 3? > > > > I answered you above the software timestamping should have the > > NETDEV_TIMESTAMPING bit set as it is done from the net device driver. > > > > What I was thinking is that all the new timestamping should have > > NETDEV_TIMESTAMPING or PHYLIB_TIMESTAMPING set to know which API to pass > > through. > > Like we could add these in the future: > > MAC_DMA_TIMESTAMPING = (2 << 2) | (1 >> 0), > > MAC_PRECISION_TIMESTAMPING = (3 << 2) | (1 >> 0), > > ... > > PHY_SFP_TIMESTAMPING = (2 << 2) | (1 << 1), > > ... > > What is "PRECISION"? DMA is a separate block like MAC and PHY. If DMA is a separate block like MAC and PHY, can it have its own PHC device, and the ethtool UAPI only lists the timestamping-capable PHCs for one NIC, and is able to select between them? Translation between the UAPI-visible PHC index and MAC, DMA, phylib PHY, other PHY etc can then be done by the kernel as needed.
On Fri, 13 Oct 2023 19:14:46 +0300 Vladimir Oltean wrote: > > What is "PRECISION"? DMA is a separate block like MAC and PHY. > > If DMA is a separate block like MAC and PHY, can it have its own PHC > device, and the ethtool UAPI only lists the timestamping-capable PHCs > for one NIC, and is able to select between them? Possibly, I guess. There are some devices which use generic (i.e. modeled by Linux as separate struct device) DMA controllers to read out packets from "MAC" FIFOs. In practice I'm not sure if any of those DMA controllers has time stamping capabilities. > Translation between the UAPI-visible PHC index and MAC, DMA, phylib > PHY, other PHY etc can then be done by the kernel as needed. Translation by the kernel at which point? IMHO it'd indeed be clearer for the user to have an ability to read the PHC for SOF_..._DMA via ETHTOOL_MSG_TS_LIST_GET_REPLY as a separate entry, rather than e.g. assume that DMA uses the same PHC as MAC.
On Fri, Oct 13, 2023 at 09:30:56AM -0700, Jakub Kicinski wrote: > On Fri, 13 Oct 2023 19:14:46 +0300 Vladimir Oltean wrote: > > > What is "PRECISION"? DMA is a separate block like MAC and PHY. > > > > If DMA is a separate block like MAC and PHY, can it have its own PHC > > device, and the ethtool UAPI only lists the timestamping-capable PHCs > > for one NIC, and is able to select between them? > > Possibly, I guess. There are some devices which use generic (i.e. > modeled by Linux as separate struct device) DMA controllers to read > out packets from "MAC" FIFOs. In practice I'm not sure if any of those > DMA controllers has time stamping capabilities. The answer is not completely satisfactory, I guess. My proposal would only work if the common denominator for a hardware timestamp provider could be modeled as a struct ptp_clock, like we do for MAC and phylib PHYs already, and we call ptp_clock_index() to get the phc_index that serves as the UAPI token for it. > > Translation between the UAPI-visible PHC index and MAC, DMA, phylib > > PHY, other PHY etc can then be done by the kernel as needed. > > Translation by the kernel at which point? The gist of what I'm proposing is for the core ethtool netlink message handler to get just the phc_index as an attribute. No other information as to what it represents. Not that it's netdev, DMA, phylib PHY or whatnot. The ethtool kernel code would iterate through the stuff registered in the system for the netdev, calling get_ts_info() or phy_ts_info() on it, until it finds something which populates struct ethtool_ts_info :: phc_index with the phc_index retrieved from netlink. Then, ethtool just talks with the timestamper that matched that phc_index. Same idea would be applied for the command that lists all timestamping layers for a netdev. Check get_ts_info(), phy_ts_info(dev->phydev), and can be extended in the future. > IMHO it'd indeed be clearer for the user to have an ability to read > the PHC for SOF_..._DMA via ETHTOOL_MSG_TS_LIST_GET_REPLY as a separate > entry, rather than e.g. assume that DMA uses the same PHC as MAC. I'm not really sure what you're referring to, with SOF_..._DMA. The DMA, if presented as a PHC as I am proposing, would play the role of the hardware timestamp provider (think SOF_TIMESTAMPING_TX_HARDWARE | SOF_TIMESTAMPING_RX_HARDWARE), so there will be no driver-visible special socket option flags for DMA timestamping.
On Fri, 13 Oct 2023 20:09:03 +0300 Vladimir Oltean wrote: > > > Translation between the UAPI-visible PHC index and MAC, DMA, phylib > > > PHY, other PHY etc can then be done by the kernel as needed. > > > > Translation by the kernel at which point? > > The gist of what I'm proposing is for the core ethtool netlink message > handler to get just the phc_index as an attribute. No other information > as to what it represents. Not that it's netdev, DMA, phylib PHY or whatnot. > > The ethtool kernel code would iterate through the stuff registered in > the system for the netdev, calling get_ts_info() or phy_ts_info() on it, > until it finds something which populates struct ethtool_ts_info :: > phc_index with the phc_index retrieved from netlink. > > Then, ethtool just talks with the timestamper that matched that phc_index. > > Same idea would be applied for the command that lists all timestamping > layers for a netdev. Check get_ts_info(), phy_ts_info(dev->phydev), and > can be extended in the future. I see, that could work. The user would then dig around sysfs to figure out which PHC has what characteristics? > > IMHO it'd indeed be clearer for the user to have an ability to read > > the PHC for SOF_..._DMA via ETHTOOL_MSG_TS_LIST_GET_REPLY as a separate > > entry, rather than e.g. assume that DMA uses the same PHC as MAC. > > I'm not really sure what you're referring to, with SOF_..._DMA. > The DMA, if presented as a PHC as I am proposing, would play the role of > the hardware timestamp provider (think SOF_TIMESTAMPING_TX_HARDWARE | > SOF_TIMESTAMPING_RX_HARDWARE), so there will be no driver-visible > special socket option flags for DMA timestamping. Each packet may want different timestamp tho, especially on Tx it should be fairly easy for socket to request to get "real" MAC stamps, while most get cheaper DMA stamps. Currently some drivers run flow matching to find PTP packets and automatically give them better quality timestamps :( Even if at the config level we use PHCs we need to translate that into some SKBTX_* bit, don't we?
On Fri, Oct 13, 2023 at 10:46:06AM -0700, Jakub Kicinski wrote: > On Fri, 13 Oct 2023 20:09:03 +0300 Vladimir Oltean wrote: > > > > Translation between the UAPI-visible PHC index and MAC, DMA, phylib > > > > PHY, other PHY etc can then be done by the kernel as needed. > > > > > > Translation by the kernel at which point? > > > > The gist of what I'm proposing is for the core ethtool netlink message > > handler to get just the phc_index as an attribute. No other information > > as to what it represents. Not that it's netdev, DMA, phylib PHY or whatnot. > > > > The ethtool kernel code would iterate through the stuff registered in > > the system for the netdev, calling get_ts_info() or phy_ts_info() on it, > > until it finds something which populates struct ethtool_ts_info :: > > phc_index with the phc_index retrieved from netlink. > > > > Then, ethtool just talks with the timestamper that matched that phc_index. > > > > Same idea would be applied for the command that lists all timestamping > > layers for a netdev. Check get_ts_info(), phy_ts_info(dev->phydev), and > > can be extended in the future. > > I see, that could work. The user would then dig around sysfs to figure > out which PHC has what characteristics? Yup. /sys/class/ptp/ptp<N>/ gives you everything else you need to know about the PHC index that's configured as the active timestamper for this netdev. It's just that (and I need to stress this again) the timestamping-capable DMA blocks that you're talking about, but I've never seen, should be able to fully implement a ptp_clock, with their own clock operations and friends. > > > IMHO it'd indeed be clearer for the user to have an ability to read > > > the PHC for SOF_..._DMA via ETHTOOL_MSG_TS_LIST_GET_REPLY as a separate > > > entry, rather than e.g. assume that DMA uses the same PHC as MAC. > > > > I'm not really sure what you're referring to, with SOF_..._DMA. > > The DMA, if presented as a PHC as I am proposing, would play the role of > > the hardware timestamp provider (think SOF_TIMESTAMPING_TX_HARDWARE | > > SOF_TIMESTAMPING_RX_HARDWARE), so there will be no driver-visible > > special socket option flags for DMA timestamping. > > Each packet may want different timestamp tho, especially on Tx it > should be fairly easy for socket to request to get "real" MAC stamps, > while most get cheaper DMA stamps. Currently some drivers run flow > matching to find PTP packets and automatically give them better quality > timestamps :( > > Even if at the config level we use PHCs we need to translate that into > some SKBTX_* bit, don't we? I think Richard had something to say about that being wishful thinking: https://lore.kernel.org/netdev/ZGw46hrpiqCVNeXS@hoboy.vegasvil.org/ On RX I'm not sure how you'd know in advance if the packet is going to be routed to a socket that wants DMA or MAC timestamps. And having a socket with hardware timestamps from one provider in one direction, and another provider in the other direction, is.... not sane as a kernel API?
On Fri, 13 Oct 2023 20:56:01 +0300 Vladimir Oltean wrote: > > > I'm not really sure what you're referring to, with SOF_..._DMA. > > > The DMA, if presented as a PHC as I am proposing, would play the role of > > > the hardware timestamp provider (think SOF_TIMESTAMPING_TX_HARDWARE | > > > SOF_TIMESTAMPING_RX_HARDWARE), so there will be no driver-visible > > > special socket option flags for DMA timestamping. > > > > Each packet may want different timestamp tho, especially on Tx it > > should be fairly easy for socket to request to get "real" MAC stamps, > > while most get cheaper DMA stamps. Currently some drivers run flow > > matching to find PTP packets and automatically give them better quality > > timestamps :( > > > > Even if at the config level we use PHCs we need to translate that into > > some SKBTX_* bit, don't we? > > I think Richard had something to say about that being wishful thinking: > https://lore.kernel.org/netdev/ZGw46hrpiqCVNeXS@hoboy.vegasvil.org/
On Fri, 13 Oct 2023 18:11:19 +0200 Andrew Lunn <andrew@lunn.ch> wrote: > > > All these possibles timestamps go through exclusively the netdev API or > > > the phylib API. Even the software timestamping is done in the netdev > > > driver, therefore it goes through the netdev API and then should have the > > > NETDEV_TIMESTAMPING bit set. > > > > Netdev vs phylib is an implementation detail of Linux. > > I'm also surprised that you changed this. > > > > > > > + */ > > > > > +enum { > > > > > + NO_TIMESTAMPING = 0, > > > > > + NETDEV_TIMESTAMPING = (1 << 0), > > > > > + PHYLIB_TIMESTAMPING = (1 << 1), > > > > > + SOFTWARE_TIMESTAMPING = (1 << 2) | (1 << 0), > > Just emphasising Jakubs point here. phylib is an implementation > detail, in that the MAC driver might be using firmware to drive its > PHY, and that firmware can do a timestamp in the PHY. The API being > defined here should be independent of the implementation details. So > it probably should be MAC_TIMESTAMPING and PHY_TIMESTAMPING, and leave > it to the driver to decide if its PHYLIB doing the actual work, or > firmware. That is one reason why I moved to NETDEV_TIMESTAMPING, we don't know if it will really be the MAC that does the timestamping, as the firmware could ask the PHY to does it, but it surely goes though the netdev driver. > Netdev vs phylib is an implementation detail of Linux. > I'm also surprised that you changed this. This is the main reason I changed this. This is Linux implementation purpose to know whether it should go through netdev or phylib, and then each of these drivers could use other timestamps which are hardware related. As I have answered to Florian maybe you prefer to separate the Linux implementation detail and the hardware timestamping like this: > Or maybe do you prefer to use defines like this: > # define NETDEV_TIMESTAMPING (1 << 0) > # define PHYLIB_TIMESTAMPING (1 << 1) > > enum { > NO_TIMESTAMPING = 0, > MAC_TIMESTAMPING = NETDEV_TIMESTAMPING, > PHY_TIMESTAMPING = PHYLIB_TIMESTAMPING, > SOFTWARE_TIMESTAMPING = (1 << 2) | NETDEV_TIMESTAMPING, > ... > MAC_DMA_TIMESTAMPING = (2 << 2) | NETDEV_TIMESTAMPING, > MAC_PRECISION_TIMESTAMPING = (3 << 2) | NETDEV_TIMESTAMPING, > }; > > The gist of what I'm proposing is for the core ethtool netlink message > > handler to get just the phc_index as an attribute. No other information > > as to what it represents. Not that it's netdev, DMA, phylib PHY or whatnot. > > > > The ethtool kernel code would iterate through the stuff registered in > > the system for the netdev, calling get_ts_info() or phy_ts_info() on it, > > until it finds something which populates struct ethtool_ts_info :: > > phc_index with the phc_index retrieved from netlink. > > > > Then, ethtool just talks with the timestamper that matched that phc_index. > > > > Same idea would be applied for the command that lists all timestamping > > layers for a netdev. Check get_ts_info(), phy_ts_info(dev->phydev), and > > can be extended in the future. > > I see, that could work. The user would then dig around sysfs to figure > out which PHC has what characteristics? I am not an expert but there are net drivers that enable SOF_TIMESTAMPING_TX/RX/RAW_HARDWARE without phc. In that case we won't ever be able to enter the get_ts_info() with you proposition. Still I am wondering why hardware timestamping capabilities can be enabled without phc.
On Mon, 16 Oct 2023 12:41:34 +0200 Köry Maincent wrote: > > Netdev vs phylib is an implementation detail of Linux. > > I'm also surprised that you changed this. > > This is the main reason I changed this. This is Linux implementation purpose to > know whether it should go through netdev or phylib, and then each of these > drivers could use other timestamps which are hardware related. For an integrated design there's 90% chance the stamping is done by the MAC. Even if it isn't there's no difference between PHY and MAC in terms of quality. But there is a big difference between MAC/PHY and DMA which would both fall under NETDEV?
On Mon, 16 Oct 2023 07:22:04 -0700 Jakub Kicinski <kuba@kernel.org> wrote: > On Mon, 16 Oct 2023 12:41:34 +0200 Köry Maincent wrote: > > > Netdev vs phylib is an implementation detail of Linux. > > > I'm also surprised that you changed this. > > > > This is the main reason I changed this. This is Linux implementation > > purpose to know whether it should go through netdev or phylib, and then > > each of these drivers could use other timestamps which are hardware > > related. > > For an integrated design there's 90% chance the stamping is done > by the MAC. Even if it isn't there's no difference between PHY > and MAC in terms of quality. Ok, but there might be quality difference in case of several timestamp configuration done in the MAC. Like the timestamping precision vs frequency precision. In that case how ethtool would tell the driver to switch between them? My solution could work for this case by simply adding new values to the enum: enum { NETDEV_TIMESTAMPING = (1 << 0), PHYLIB_TIMESTAMPING = (1 << 1), MAC_TS_PRECISION = (1 << 2)|(1 << 0), MAC_FREQ_PRECISION = (2 << 2)|(1 << 0), } Automatically Linux will go through the netdev implementation and could pass the enum value to the netdev driver. > But there is a big difference between MAC/PHY and DMA which would > both fall under NETDEV? Currently there is no DMA timestamping support right? And I suppose it fill fall under the net device management? In that case we will have MAC and DMA under netdev and PHY under phylib and we won't have to do anything more than this timestamping management patch: https://lore.kernel.org/netdev/20231009155138.86458-14-kory.maincent@bootlin.com/
On Mon, 16 Oct 2023 17:00:27 +0200 Köry Maincent wrote: > On Mon, 16 Oct 2023 07:22:04 -0700 > Jakub Kicinski <kuba@kernel.org> wrote: > > > This is the main reason I changed this. This is Linux implementation > > > purpose to know whether it should go through netdev or phylib, and then > > > each of these drivers could use other timestamps which are hardware > > > related. > > > > For an integrated design there's 90% chance the stamping is done > > by the MAC. Even if it isn't there's no difference between PHY > > and MAC in terms of quality. > > Ok, but there might be quality difference in case of several timestamp > configuration done in the MAC. Like the timestamping precision vs frequency > precision. In that case how ethtool would tell the driver to switch between > them? What's the reason for timestamp precision differences? My understanding so far was the the differences come from: 1. different stamping device (i.e. separate "piece of silicon", accessed over a different bus, with different PHC etc.) 2. different stamping point (MAC vs DMA) I don't think any "integrated" device would support stamps which differ under category 1. > My solution could work for this case by simply adding new values to the enum: > > enum { > NETDEV_TIMESTAMPING = (1 << 0), > PHYLIB_TIMESTAMPING = (1 << 1), > MAC_TS_PRECISION = (1 << 2)|(1 << 0), > MAC_FREQ_PRECISION = (2 << 2)|(1 << 0), > } > > Automatically Linux will go through the netdev implementation and could pass > the enum value to the netdev driver. We can add multiple fields to netlink. Why use the magic encoding? > > But there is a big difference between MAC/PHY and DMA which would > > both fall under NETDEV? > > Currently there is no DMA timestamping support right? Kinda. Some devices pass DMA stamps as "HW stamps", and pretend they are "good enough". But yes, there's no distinction at API level. > And I suppose it fill fall under the net device management? Yes. > In that case we will have MAC and DMA under netdev and PHY under phylib and > we won't have to do anything more than this timestamping management patch: > https://lore.kernel.org/netdev/20231009155138.86458-14-kory.maincent@bootlin.com/ Maybe we should start with a doc describing what APIs are at play, what questions they answer, and what hard use cases we have. I'm not opposed to the ethool API reporting just the differences from my point 1. (in the first paragraph). But then we shouldn't call that "layer", IMO, but device or source or such.
On Mon, 16 Oct 2023 08:43:46 -0700 Jakub Kicinski <kuba@kernel.org> wrote: > On Mon, 16 Oct 2023 17:00:27 +0200 Köry Maincent wrote: > > On Mon, 16 Oct 2023 07:22:04 -0700 > > Jakub Kicinski <kuba@kernel.org> wrote: > > > Ok, but there might be quality difference in case of several timestamp > > configuration done in the MAC. Like the timestamping precision vs frequency > > precision. In that case how ethtool would tell the driver to switch between > > them? > > What's the reason for timestamp precision differences? > My understanding so far was the the differences come from: > 1. different stamping device (i.e. separate "piece of silicon", > accessed over a different bus, with different PHC etc.) > 2. different stamping point (MAC vs DMA) > > I don't think any "integrated" device would support stamps which > differ under category 1. It was a case reported by Maxime on v3: https://lore.kernel.org/netdev/20230324112541.0b3dd38a@pc-7.home/ > > My solution could work for this case by simply adding new values to the > > enum: > > > > enum { > > NETDEV_TIMESTAMPING = (1 << 0), > > PHYLIB_TIMESTAMPING = (1 << 1), > > MAC_TS_PRECISION = (1 << 2)|(1 << 0), > > MAC_FREQ_PRECISION = (2 << 2)|(1 << 0), > > } > > > > Automatically Linux will go through the netdev implementation and could pass > > the enum value to the netdev driver. > > We can add multiple fields to netlink. Why use the magic encoding? To simplify the Linux code to go under either netdev or phylib implementation without needing describing all the enum possibility in the condition: if (ts_layer & PHYLIB_TIMESTAMPING) ... if (ts_layer & NETDEV_TIMESTAMPING) ... We also could add "is_phylib" and "is_netdev" functions with a simple switch case in it, but we have to be careful to always update these functions when new enum values will appear. > > > > But there is a big difference between MAC/PHY and DMA which would > > > both fall under NETDEV? > > > > Currently there is no DMA timestamping support right? > > Kinda. Some devices pass DMA stamps as "HW stamps", and pretend they > are "good enough". But yes, there's no distinction at API level. Ok. I did suppose this when writing my last reply. > > In that case we will have MAC and DMA under netdev and PHY under phylib and > > we won't have to do anything more than this timestamping management patch: > > https://lore.kernel.org/netdev/20231009155138.86458-14-kory.maincent@bootlin.com/ > > > > Maybe we should start with a doc describing what APIs are at play, > what questions they answer, and what hard use cases we have. > > I'm not opposed to the ethool API reporting just the differences > from my point 1. (in the first paragraph). But then we shouldn't > call that "layer", IMO, but device or source or such. I am open to change the naming to fit the best for our current and future usage. If we take into account the Maxime case of several timestamps on a device then maybe source could work.
On Mon, 16 Oct 2023 18:23:07 +0200 Köry Maincent wrote: > > What's the reason for timestamp precision differences? > > My understanding so far was the the differences come from: > > 1. different stamping device (i.e. separate "piece of silicon", > > accessed over a different bus, with different PHC etc.) > > 2. different stamping point (MAC vs DMA) > > > > I don't think any "integrated" device would support stamps which > > differ under category 1. > > It was a case reported by Maxime on v3: > https://lore.kernel.org/netdev/20230324112541.0b3dd38a@pc-7.home/ IMHO this talks about how clock control/disciplining works which is a somewhat independent topic of timestamping.
On 10/16/2023 10:03 AM, Jakub Kicinski wrote: > On Mon, 16 Oct 2023 18:23:07 +0200 Köry Maincent wrote: >>> What's the reason for timestamp precision differences? >>> My understanding so far was the the differences come from: >>> 1. different stamping device (i.e. separate "piece of silicon", >>> accessed over a different bus, with different PHC etc.) >>> 2. different stamping point (MAC vs DMA) >>> >>> I don't think any "integrated" device would support stamps which >>> differ under category 1. >> >> It was a case reported by Maxime on v3: >> https://lore.kernel.org/netdev/20230324112541.0b3dd38a@pc-7.home/ > > IMHO this talks about how clock control/disciplining works which > is a somewhat independent topic of timestamping. The thread in question mentions that the device has two modes, one which has higher precision for the timestamps, and one which has better precision on frequency adjustments. I don't know the details for why the hardware has this behavior, but being able to switch between the two timestamp modes has value as described by the thread. I'm not sure how to represent that in such an API because both modes seem to capture the timestamp at the MAC.
On Mon, Oct 16, 2023 at 12:41:34PM +0200, Köry Maincent wrote: > Still I am wondering why hardware timestamping capabilities can be enabled > without phc. There is hardware that simply provides time values on frames from a free running clock, and that clock cannot be read, set, or adjusted. So the time stamps only relate to other time stamps from the same device. That might be used for performance analysis. Thanks, Richard
On Mon, 16 Oct 2023 16:50:01 -0700 Richard Cochran <richardcochran@gmail.com> wrote: > On Mon, Oct 16, 2023 at 12:41:34PM +0200, Köry Maincent wrote: > > > Still I am wondering why hardware timestamping capabilities can be enabled > > without phc. > > There is hardware that simply provides time values on frames from a > free running clock, and that clock cannot be read, set, or adjusted. > > So the time stamps only relate to other time stamps from the same > device. That might be used for performance analysis. Ok, thanks you for the information. Köry
On Mon, 16 Oct 2023 16:03:13 -0700 Jacob Keller <jacob.e.keller@intel.com> wrote: > On 10/16/2023 10:03 AM, Jakub Kicinski wrote: > > On Mon, 16 Oct 2023 18:23:07 +0200 Köry Maincent wrote: > >>> What's the reason for timestamp precision differences? > >>> My understanding so far was the the differences come from: > >>> 1. different stamping device (i.e. separate "piece of silicon", > >>> accessed over a different bus, with different PHC etc.) > >>> 2. different stamping point (MAC vs DMA) > >>> > >>> I don't think any "integrated" device would support stamps which > >>> differ under category 1. > >> > >> It was a case reported by Maxime on v3: > >> https://lore.kernel.org/netdev/20230324112541.0b3dd38a@pc-7.home/ > > > > IMHO this talks about how clock control/disciplining works which > > is a somewhat independent topic of timestamping. > > The thread in question mentions that the device has two modes, one which > has higher precision for the timestamps, and one which has better > precision on frequency adjustments. I don't know the details for why the > hardware has this behavior, but being able to switch between the two > timestamp modes has value as described by the thread. > > I'm not sure how to represent that in such an API because both modes > seem to capture the timestamp at the MAC. After some thought, indeed moving back to MAC/PHY_TIMESTAMPING seems better. This case of several timestamp modes in MAC is currently only for the special stmmac case. We could support it the same way we could support multiPHY by saving the source id of the timestamp like this in net_device: struct { enum ts_layer layer, int source_id, } ts
diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst index 2540c70952ff..644b3b764044 100644 --- a/Documentation/networking/ethtool-netlink.rst +++ b/Documentation/networking/ethtool-netlink.rst @@ -225,6 +225,7 @@ Userspace to kernel: ``ETHTOOL_MSG_RSS_GET`` get RSS settings ``ETHTOOL_MSG_MM_GET`` get MAC merge layer state ``ETHTOOL_MSG_MM_SET`` set MAC merge layer parameters + ``ETHTOOL_MSG_TS_GET`` get current timestamping ===================================== ================================= Kernel to userspace: @@ -268,6 +269,7 @@ Kernel to userspace: ``ETHTOOL_MSG_PSE_GET_REPLY`` PSE parameters ``ETHTOOL_MSG_RSS_GET_REPLY`` RSS settings ``ETHTOOL_MSG_MM_GET_REPLY`` MAC merge layer status + ``ETHTOOL_MSG_TS_GET_REPLY`` current timestamping ======================================== ================================= ``GET`` requests are sent by userspace applications to retrieve device @@ -1994,6 +1996,26 @@ The attributes are propagated to the driver through the following structure: .. kernel-doc:: include/linux/ethtool.h :identifiers: ethtool_mm_cfg +TS_GET +====== + +Gets current timestamping. + +Request contents: + + ================================= ====== ==================== + ``ETHTOOL_A_TS_HEADER`` nested request header + ================================= ====== ==================== + +Kernel response contents: + + ======================= ====== ============================== + ``ETHTOOL_A_TS_HEADER`` nested reply header + ``ETHTOOL_A_TS_LAYER`` u32 current timestamping + ======================= ====== ============================== + +This command get the current timestamp layer. + Request translation =================== @@ -2100,4 +2122,5 @@ are netlink only. n/a ``ETHTOOL_MSG_PLCA_GET_STATUS`` n/a ``ETHTOOL_MSG_MM_GET`` n/a ``ETHTOOL_MSG_MM_SET`` + n/a ``ETHTOOL_MSG_TS_GET`` =================================== ===================================== diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index 73e2c10dc2cc..cb51136328cf 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -57,6 +57,7 @@ enum { ETHTOOL_MSG_PLCA_GET_STATUS, ETHTOOL_MSG_MM_GET, ETHTOOL_MSG_MM_SET, + ETHTOOL_MSG_TS_GET, /* add new constants above here */ __ETHTOOL_MSG_USER_CNT, @@ -109,6 +110,7 @@ enum { ETHTOOL_MSG_PLCA_NTF, ETHTOOL_MSG_MM_GET_REPLY, ETHTOOL_MSG_MM_NTF, + ETHTOOL_MSG_TS_GET_REPLY, /* add new constants above here */ __ETHTOOL_MSG_KERNEL_CNT, @@ -975,6 +977,18 @@ enum { ETHTOOL_A_MM_MAX = (__ETHTOOL_A_MM_CNT - 1) }; +/* TS LAYER */ + +enum { + ETHTOOL_A_TS_UNSPEC, + ETHTOOL_A_TS_HEADER, /* nest - _A_HEADER_* */ + ETHTOOL_A_TS_LAYER, /* u32 */ + + /* add new constants above here */ + __ETHTOOL_A_TS_CNT, + ETHTOOL_A_TS_MAX = (__ETHTOOL_A_TS_CNT - 1) +}; + /* generic netlink info */ #define ETHTOOL_GENL_NAME "ethtool" #define ETHTOOL_GENL_VERSION 1 diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h index df8091998c8d..33ff8e989dbe 100644 --- a/include/uapi/linux/net_tstamp.h +++ b/include/uapi/linux/net_tstamp.h @@ -13,6 +13,20 @@ #include <linux/types.h> #include <linux/socket.h> /* for SO_TIMESTAMPING */ +/* + * Hardware layer of the TIMESTAMPING provider + * New description layer should have the NETDEV_TIMESTAMPING or + * PHYLIB_TIMESTAMPING bit set to know which API to use for timestamping. + */ +enum { + NO_TIMESTAMPING = 0, + NETDEV_TIMESTAMPING = (1 << 0), + PHYLIB_TIMESTAMPING = (1 << 1), + SOFTWARE_TIMESTAMPING = (1 << 2) | (1 << 0), + + __TIMESTAMPING_COUNT, +}; + /* SO_TIMESTAMPING flags */ enum { SOF_TIMESTAMPING_TX_HARDWARE = (1<<0), diff --git a/net/ethtool/Makefile b/net/ethtool/Makefile index 504f954a1b28..4ea64c080639 100644 --- a/net/ethtool/Makefile +++ b/net/ethtool/Makefile @@ -8,4 +8,4 @@ ethtool_nl-y := netlink.o bitset.o strset.o linkinfo.o linkmodes.o rss.o \ linkstate.o debug.o wol.o features.o privflags.o rings.o \ channels.o coalesce.o pause.o eee.o tsinfo.o cabletest.o \ tunnels.o fec.o eeprom.o stats.o phc_vclocks.o mm.o \ - module.o pse-pd.o plca.o mm.o + module.o pse-pd.o plca.o mm.o ts.o diff --git a/net/ethtool/common.h b/net/ethtool/common.h index 28b8aaaf9bcb..a264b635f7d3 100644 --- a/net/ethtool/common.h +++ b/net/ethtool/common.h @@ -35,6 +35,7 @@ extern const char wol_mode_names[][ETH_GSTRING_LEN]; extern const char sof_timestamping_names[][ETH_GSTRING_LEN]; extern const char ts_tx_type_names[][ETH_GSTRING_LEN]; extern const char ts_rx_filter_names[][ETH_GSTRING_LEN]; +extern const char ts_layer_names[][ETH_GSTRING_LEN]; extern const char udp_tunnel_type_names[][ETH_GSTRING_LEN]; int __ethtool_get_link(struct net_device *dev); diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c index 3bbd5afb7b31..561c0931d055 100644 --- a/net/ethtool/netlink.c +++ b/net/ethtool/netlink.c @@ -306,6 +306,7 @@ ethnl_default_requests[__ETHTOOL_MSG_USER_CNT] = { [ETHTOOL_MSG_PLCA_GET_STATUS] = ðnl_plca_status_request_ops, [ETHTOOL_MSG_MM_GET] = ðnl_mm_request_ops, [ETHTOOL_MSG_MM_SET] = ðnl_mm_request_ops, + [ETHTOOL_MSG_TS_GET] = ðnl_ts_request_ops, }; static struct ethnl_dump_ctx *ethnl_dump_context(struct netlink_callback *cb) @@ -1128,6 +1129,15 @@ static const struct genl_ops ethtool_genl_ops[] = { .policy = ethnl_mm_set_policy, .maxattr = ARRAY_SIZE(ethnl_mm_set_policy) - 1, }, + { + .cmd = ETHTOOL_MSG_TS_GET, + .doit = ethnl_default_doit, + .start = ethnl_default_start, + .dumpit = ethnl_default_dumpit, + .done = ethnl_default_done, + .policy = ethnl_ts_get_policy, + .maxattr = ARRAY_SIZE(ethnl_ts_get_policy) - 1, + }, }; static const struct genl_multicast_group ethtool_nl_mcgrps[] = { diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h index 9a333a8d04c1..1e6085198acc 100644 --- a/net/ethtool/netlink.h +++ b/net/ethtool/netlink.h @@ -395,6 +395,7 @@ extern const struct ethnl_request_ops ethnl_rss_request_ops; extern const struct ethnl_request_ops ethnl_plca_cfg_request_ops; extern const struct ethnl_request_ops ethnl_plca_status_request_ops; extern const struct ethnl_request_ops ethnl_mm_request_ops; +extern const struct ethnl_request_ops ethnl_ts_request_ops; extern const struct nla_policy ethnl_header_policy[ETHTOOL_A_HEADER_FLAGS + 1]; extern const struct nla_policy ethnl_header_policy_stats[ETHTOOL_A_HEADER_FLAGS + 1]; @@ -441,6 +442,7 @@ extern const struct nla_policy ethnl_plca_set_cfg_policy[ETHTOOL_A_PLCA_MAX + 1] extern const struct nla_policy ethnl_plca_get_status_policy[ETHTOOL_A_PLCA_HEADER + 1]; extern const struct nla_policy ethnl_mm_get_policy[ETHTOOL_A_MM_HEADER + 1]; extern const struct nla_policy ethnl_mm_set_policy[ETHTOOL_A_MM_MAX + 1]; +extern const struct nla_policy ethnl_ts_get_policy[ETHTOOL_A_TS_HEADER + 1]; int ethnl_set_features(struct sk_buff *skb, struct genl_info *info); int ethnl_act_cable_test(struct sk_buff *skb, struct genl_info *info); diff --git a/net/ethtool/ts.c b/net/ethtool/ts.c new file mode 100644 index 000000000000..cd33f057ee48 --- /dev/null +++ b/net/ethtool/ts.c @@ -0,0 +1,78 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include <linux/net_tstamp.h> +#include <linux/phy.h> + +#include "netlink.h" +#include "common.h" +#include "bitset.h" + +struct ts_req_info { + struct ethnl_req_info base; +}; + +struct ts_reply_data { + struct ethnl_reply_data base; + u32 ts_layer; +}; + +#define TS_REPDATA(__reply_base) \ + container_of(__reply_base, struct ts_reply_data, base) + +/* TS_GET */ +const struct nla_policy ethnl_ts_get_policy[] = { + [ETHTOOL_A_TS_HEADER] = + NLA_POLICY_NESTED(ethnl_header_policy), +}; + +static int ts_prepare_data(const struct ethnl_req_info *req_base, + struct ethnl_reply_data *reply_base, + const struct genl_info *info) +{ + struct ts_reply_data *data = TS_REPDATA(reply_base); + struct net_device *dev = reply_base->dev; + const struct ethtool_ops *ops = dev->ethtool_ops; + int ret; + + ret = ethnl_ops_begin(dev); + if (ret < 0) + return ret; + + if (phy_has_tsinfo(dev->phydev)) + data->ts_layer = PHYLIB_TIMESTAMPING; + else if (ops->get_ts_info) + data->ts_layer = NETDEV_TIMESTAMPING; + else + data->ts_layer = NO_TIMESTAMPING; + + ethnl_ops_complete(dev); + + return ret; +} + +static int ts_reply_size(const struct ethnl_req_info *req_base, + const struct ethnl_reply_data *reply_base) +{ + return nla_total_size(sizeof(u32)); +} + +static int ts_fill_reply(struct sk_buff *skb, + const struct ethnl_req_info *req_base, + const struct ethnl_reply_data *reply_base) +{ + struct ts_reply_data *data = TS_REPDATA(reply_base); + + return nla_put_u32(skb, ETHTOOL_A_TS_LAYER, data->ts_layer); +} + +const struct ethnl_request_ops ethnl_ts_request_ops = { + .request_cmd = ETHTOOL_MSG_TS_GET, + .reply_cmd = ETHTOOL_MSG_TS_GET_REPLY, + .hdr_attr = ETHTOOL_A_TS_HEADER, + .req_info_size = sizeof(struct ts_req_info), + .reply_data_size = sizeof(struct ts_reply_data), + + .prepare_data = ts_prepare_data, + .reply_size = ts_reply_size, + .fill_reply = ts_fill_reply, +};