Message ID | 20240828-e810-live-migration-jk-prep-ctx-functions-v2-10-558ab9e240f5@intel.com (mailing list archive) |
---|---|
State | Awaiting Upstream |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | ice: use <linux/packing.h> for Tx and Rx queue context data | expand |
On Wed, Aug 28, 2024 at 01:57:26PM -0700, Jacob Keller wrote: > The major difference with <linux/packing.h> is that it expects the unpacked > data will always be a u64. This is somewhat limiting, but can be worked > around by using a local temporary u64. > > As with the other implementations, we handle the error codes from pack() > with a pr_err and a call to dump_stack. These are unexpected as they should > only happen due to programmer error. > > Note that I initially tried implementing this as functions which just > repeatably called the ice_ctx_pack() function instead of using the > ice_rlan_ctx_info and ice_tlan_ctx_info arrays. This does work, but it has > a couple of downsides: > > 1) it wastes a significant amount of bytes in the text section, vs the > savings from removing the RO data of the arrays. > > 2) this cost is made worse after implementing an unpack function, as we > must duplicate the list of packings for the unpack function. I agree with your concerns and with your decision of keeping the ICE_CTX_STORE tables. Actually I have some more unposted lib/packing changes which operate on struct packed_field arrays, very, very similar to the ICE_CTX_STORE tables. Essentially two new calls: pack_fields() and unpack_fields(), which perform the iteration inside the core library. (the only real difference being that I went for startbit and endbit in their definitions, rather than LSB+size). I came to the realization that this API would be nice exactly because otherwise, you need to duplicate the field definitions, once for the pack() call and once for the unpack(). But if they're tables, you don't. I'm looking at ways in which this new API could solve all the shortcomings I don't like with lib/packing in general. Without being even aware of ICE_CTX_STORE, I had actually implemented the type conversion to smaller unpacked u8/u16/u32 types in exactly the same way. I also wish to do something about the way in which lib/packing deals with errors. I don't think it's exactly great for every driver to have to dump stack and ignore them. And also, since they're programming errors, it's odd (and bad for performance) to perform the sanity checks for every function call, instead of just once, say at boot time. So I had thought of a third new API call: packed_fields_validate(), which runs at module_init() in the consumer driver (ice, sja1105, ...) and operates on the field tables. Basically it is a singular replacement for these existing verifications in pack() and unpack(): /* startbit is expected to be larger than endbit, and both are * expected to be within the logically addressable range of the buffer. */ if (unlikely(startbit < endbit || startbit >= 8 * pbuflen)) /* Invalid function call */ return -EINVAL; value_width = startbit - endbit + 1; if (unlikely(value_width > 64)) return -ERANGE; so you actually need to tell packed_fields_validate() what is the size of the buffer you intend to run pack_fields() and unpack_fields() on. I don't see it as a problem at all that the packed buffer size must be fixed and known ahead of time - you also operate on fixed ICE_RXQ_CTX_SZ and ... hmmm... txq_ctx[22]? sized buffers. But this packed_fields_validate() idea quickly creates another set of 2 problems which I didn't get to the bottom of: 1. Some sanity checks in pack() are dynamic and cannot be run at module_init() time. Like this: /* Check if "uval" fits in "value_width" bits. * If value_width is 64, the check will fail, but any * 64-bit uval will surely fit. */ if (value_width < 64 && uval >= (1ull << value_width)) /* Cannot store "uval" inside "value_width" bits. * Truncating "uval" is most certainly not desirable, * so simply erroring out is appropriate. */ return -ERANGE; The worse part is that the check is actually useful. It led to the quick identification of the bug behind commit 24deec6b9e4a ("net: dsa: sja1105: disallow C45 transactions on the BASE-TX MDIO bus"), rather than seeing a silent failure. But... I would still really like the revised lib/packing API to return void for pack_fields() and unpack_fields(). Not really sure how to reconcile these. 2. How to make sure that the pbuflen passed to packed_fields_validate() will be the same one as the pbuflen passed to all subsequent pack_fields() calls validated against that very pbuflen? Sorry for not posting code and just telling a story about it. There isn't much to show other than unfinished ideas with conflicting requirements. So I thought maybe I could use a little help with some brainstorming. Of course, do let me know if you are not that interested in making the ICE_CTX_STORE tables idea a part of the packing core. Thanks!
On 9/2/2024 5:08 PM, Vladimir Oltean wrote: > On Wed, Aug 28, 2024 at 01:57:26PM -0700, Jacob Keller wrote: >> The major difference with <linux/packing.h> is that it expects the unpacked >> data will always be a u64. This is somewhat limiting, but can be worked >> around by using a local temporary u64. >> >> As with the other implementations, we handle the error codes from pack() >> with a pr_err and a call to dump_stack. These are unexpected as they should >> only happen due to programmer error. >> >> Note that I initially tried implementing this as functions which just >> repeatably called the ice_ctx_pack() function instead of using the >> ice_rlan_ctx_info and ice_tlan_ctx_info arrays. This does work, but it has >> a couple of downsides: >> >> 1) it wastes a significant amount of bytes in the text section, vs the >> savings from removing the RO data of the arrays. >> >> 2) this cost is made worse after implementing an unpack function, as we >> must duplicate the list of packings for the unpack function. > > I agree with your concerns and with your decision of keeping the > ICE_CTX_STORE tables. Actually I have some more unposted lib/packing > changes which operate on struct packed_field arrays, very, very similar > to the ICE_CTX_STORE tables. Essentially two new calls: pack_fields() > and unpack_fields(), which perform the iteration inside the core library. > (the only real difference being that I went for startbit and endbit in > their definitions, rather than LSB+size). > I only kept my interface in terms of lsb+size because I did not want to attempt to re-write the table. I actually did re-write the table at first, and discovered a documentation bug because our documentation for the table has incorrect lsb/msb for some of the fields in some versions of the doc! I ultimately don't mind switching to the packing convention of start/end (though my brain does have trouble sometimes thinking of the start as the higher bit...) > I came to the realization that this API would be nice exactly because > otherwise, you need to duplicate the field definitions, once for the > pack() call and once for the unpack(). But if they're tables, you don't. > > I'm looking at ways in which this new API could solve all the shortcomings > I don't like with lib/packing in general. Without being even aware of > ICE_CTX_STORE, I had actually implemented the type conversion to smaller > unpacked u8/u16/u32 types in exactly the same way. I think having this in the core API with a standardized table, along with support for unpacking the types would be great! > > I also wish to do something about the way in which lib/packing deals > with errors. I don't think it's exactly great for every driver to have > to dump stack and ignore them. And also, since they're programming > errors, it's odd (and bad for performance) to perform the sanity checks > for every function call, instead of just once, say at boot time. So I > had thought of a third new API call: packed_fields_validate(), which > runs at module_init() in the consumer driver (ice, sja1105, ...) and > operates on the field tables. > It does seem reasonable to me that we can move the sanity checks here, especially since the main programmer error is if this table is incorrect in one of these ways. > Basically it is a singular replacement for these existing verifications > in pack() and unpack(): > > /* startbit is expected to be larger than endbit, and both are > * expected to be within the logically addressable range of the buffer. > */ > if (unlikely(startbit < endbit || startbit >= 8 * pbuflen)) > /* Invalid function call */ > return -EINVAL; > > value_width = startbit - endbit + 1; > if (unlikely(value_width > 64)) > return -ERANGE; > > so you actually need to tell packed_fields_validate() what is the size > of the buffer you intend to run pack_fields() and unpack_fields() on. > I don't see it as a problem at all that the packed buffer size must be > fixed and known ahead of time - you also operate on fixed ICE_RXQ_CTX_SZ > and ... hmmm... txq_ctx[22]? sized buffers. > Yea, these are fixed sizes. Strictly I think we could have a macro defining the size of the Tx queue context as well.... > But this packed_fields_validate() idea quickly creates another set of 2 > problems which I didn't get to the bottom of: > > 1. Some sanity checks in pack() are dynamic and cannot be run at > module_init() time. Like this: > > /* Check if "uval" fits in "value_width" bits. > * If value_width is 64, the check will fail, but any > * 64-bit uval will surely fit. > */ > if (value_width < 64 && uval >= (1ull << value_width)) > /* Cannot store "uval" inside "value_width" bits. > * Truncating "uval" is most certainly not desirable, > * so simply erroring out is appropriate. > */ > return -ERANGE; > If we support u8/u16/u32/u64 sizes as well, you could check that the size of the unpacked variable too. Could this data be in the table? Oh I guess technically not because we are checking if the actual value passed fits. I think keeping this but making it WARN would be sufficient? > The worse part is that the check is actually useful. It led to the > quick identification of the bug behind commit 24deec6b9e4a ("net: > dsa: sja1105: disallow C45 transactions on the BASE-TX MDIO bus"), > rather than seeing a silent failure. But... I would still really like > the revised lib/packing API to return void for pack_fields() and > unpack_fields(). Not really sure how to reconcile these. > Since this is generally programmer error (not something where uAPI can affect it) what about making these WARN in the core? > 2. How to make sure that the pbuflen passed to packed_fields_validate() > will be the same one as the pbuflen passed to all subsequent pack_fields() > calls validated against that very pbuflen? > I guess you either duplicate the check and WARN, or you don't, and let it panic on the invalid memory? But I guess that would only actually panic with something like KASAN > Sorry for not posting code and just telling a story about it. There > isn't much to show other than unfinished ideas with conflicting > requirements. So I thought maybe I could use a little help with some > brainstorming. Of course, do let me know if you are not that interested > in making the ICE_CTX_STORE tables idea a part of the packing core. > > Thanks! I think moving this into core would be fantastic. Since pretty much every driver handles these sanity checks the same way, I also think that moving those into the core and making them WARN or similar seems reasonable, so we can make the pack/unpack as void. It would be interesting to see a comparison of the resulting module size. How much of this do you have code for now?
On Tue, Sep 03, 2024 at 02:16:59PM -0700, Jacob Keller wrote: > I only kept my interface in terms of lsb+size because I did not want to > attempt to re-write the table. I actually did re-write the table at > first, and discovered a documentation bug because our documentation for > the table has incorrect lsb/msb for some of the fields in some versions > of the doc! > > I ultimately don't mind switching to the packing convention of start/end > (though my brain does have trouble sometimes thinking of the start as > the higher bit...) Well, you can keep the ICE_CTX_STORE() macro formatted in whichever way you want, including Width+LSB, but reimplement it to generate a struct packed_field, formatted as startbit+endbit (endbit=LSB, startbit=LSB+Width-1). I actually encourage any user of the packing library to write the code in the closest way possible to the field definitions as they appear in the documentation. What do you think? > > I came to the realization that this API would be nice exactly because > > otherwise, you need to duplicate the field definitions, once for the > > pack() call and once for the unpack(). But if they're tables, you don't. > > > > I'm looking at ways in which this new API could solve all the shortcomings > > I don't like with lib/packing in general. Without being even aware of > > ICE_CTX_STORE, I had actually implemented the type conversion to smaller > > unpacked u8/u16/u32 types in exactly the same way. > > I think having this in the core API with a standardized table, along > with support for unpacking the types would be great! Cool then! > > I also wish to do something about the way in which lib/packing deals > > with errors. I don't think it's exactly great for every driver to have > > to dump stack and ignore them. And also, since they're programming > > errors, it's odd (and bad for performance) to perform the sanity checks > > for every function call, instead of just once, say at boot time. So I > > had thought of a third new API call: packed_fields_validate(), which > > runs at module_init() in the consumer driver (ice, sja1105, ...) and > > operates on the field tables. > > > > It does seem reasonable to me that we can move the sanity checks here, > especially since the main programmer error is if this table is incorrect > in one of these ways. Actually I did manage to cook something up which I like even more than packed_fields_validate(): a compile-time sanity check. I'm not completely happy with it, just reasonably happy. You'll see. > > Basically it is a singular replacement for these existing verifications > > in pack() and unpack(): > > > > /* startbit is expected to be larger than endbit, and both are > > * expected to be within the logically addressable range of the buffer. > > */ > > if (unlikely(startbit < endbit || startbit >= 8 * pbuflen)) > > /* Invalid function call */ > > return -EINVAL; > > > > value_width = startbit - endbit + 1; > > if (unlikely(value_width > 64)) > > return -ERANGE; > > > > so you actually need to tell packed_fields_validate() what is the size > > of the buffer you intend to run pack_fields() and unpack_fields() on. > > I don't see it as a problem at all that the packed buffer size must be > > fixed and known ahead of time - you also operate on fixed ICE_RXQ_CTX_SZ > > and ... hmmm... txq_ctx[22]? sized buffers. > > > > Yea, these are fixed sizes. Strictly I think we could have a macro > defining the size of the Tx queue context as well.... Yeah, you'll need than when you'll see what I've prepared below :) > > But this packed_fields_validate() idea quickly creates another set of 2 > > problems which I didn't get to the bottom of: > > > > 1. Some sanity checks in pack() are dynamic and cannot be run at > > module_init() time. Like this: > > > > /* Check if "uval" fits in "value_width" bits. > > * If value_width is 64, the check will fail, but any > > * 64-bit uval will surely fit. > > */ > > if (value_width < 64 && uval >= (1ull << value_width)) > > /* Cannot store "uval" inside "value_width" bits. > > * Truncating "uval" is most certainly not desirable, > > * so simply erroring out is appropriate. > > */ > > return -ERANGE; > > > > If we support u8/u16/u32/u64 sizes as well, you could check that the > size of the unpacked variable too. Could this data be in the table? Oh I > guess technically not because we are checking if the actual value passed > fits. I think keeping this but making it WARN would be sufficient? The u8/u16/u32/u64 unpacked field size will absolutely be held in the struct packed_field tables. > > The worse part is that the check is actually useful. It led to the > > quick identification of the bug behind commit 24deec6b9e4a ("net: > > dsa: sja1105: disallow C45 transactions on the BASE-TX MDIO bus"), > > rather than seeing a silent failure. But... I would still really like > > the revised lib/packing API to return void for pack_fields() and > > unpack_fields(). Not really sure how to reconcile these. > > > > Since this is generally programmer error (not something where uAPI can > affect it) what about making these WARN in the core? Yes, I went for demoting the -ERANGE error in the truncation case to a WARN() that is printed for both the "int pack()" as well as the new "void __pack()" call. Practically, it doesn't make a big difference whether we bail out or do nothing, as long as we loudly complain in either case. It's the complaint that leads to the bug getting easily identified and fixed. > > 2. How to make sure that the pbuflen passed to packed_fields_validate() > > will be the same one as the pbuflen passed to all subsequent pack_fields() > > calls validated against that very pbuflen? > > I guess you either duplicate the check and WARN, or you don't, and let > it panic on the invalid memory? But I guess that would only actually > panic with something like KASAN Yeah, I'm still not too sure here. The length of the packed buffer still needs to be "obvious to the eye", since the same length must be manually passed to the sanity check as well as to the actual pack_fields() call, otherwise "bad things" will happen. I believe in the users' ability to structure their code in such a way that this is not hard. Especially since the sanity checks now rely on BUILD_BUG_ON(), and that can be technically be placed anywhere in the code, I expect the best place to put it is exactly near the pack_fields() call. That way, it's the most obvious that the buffer length declared for the sanity check is identical to the one during actual usage. Especially if the usage can be restricted to just one function or two. > > Sorry for not posting code and just telling a story about it. There > > isn't much to show other than unfinished ideas with conflicting > > requirements. So I thought maybe I could use a little help with some > > brainstorming. Of course, do let me know if you are not that interested > > in making the ICE_CTX_STORE tables idea a part of the packing core. > > > > Thanks! > > I think moving this into core would be fantastic. Since pretty much > every driver handles these sanity checks the same way, I also think that > moving those into the core and making them WARN or similar seems > reasonable, so we can make the pack/unpack as void. > > It would be interesting to see a comparison of the resulting module size. Yeah, I'm also very much interested in comparisons: text size vs rodata size vs dynamic memory usage. With the pack_fields() API in the sja1105 driver, I can shrink an enormous amount of u64 unpacked structure fields to just u8. I also really like the idea of compile-time sanity checks, and I'm curious how much that matters in a benchmark or something. Anyway, I don't have a complete rework at the moment, so there's that. > How much of this do you have code for now? Well, I do have code, but it's not yet complete :) I've updated by https://github.com/vladimiroltean/linux/tree/packing-selftests branch on top of your patch set. Until HEAD~1, I've tested the sja1105 driver and it still works. The last patch - the bulk of the conversion actually - is extremely tedious and is not yet ready (it doesn't even compile). Yet, with a bit of imagination, it should be enough to provide an example and hopefully move the discussion forward.
On 9/3/2024 3:13 PM, Vladimir Oltean wrote: > On Tue, Sep 03, 2024 at 02:16:59PM -0700, Jacob Keller wrote: >> I only kept my interface in terms of lsb+size because I did not want to >> attempt to re-write the table. I actually did re-write the table at >> first, and discovered a documentation bug because our documentation for >> the table has incorrect lsb/msb for some of the fields in some versions >> of the doc! >> >> I ultimately don't mind switching to the packing convention of start/end >> (though my brain does have trouble sometimes thinking of the start as >> the higher bit...) > > Well, you can keep the ICE_CTX_STORE() macro formatted in whichever way > you want, including Width+LSB, but reimplement it to generate a struct > packed_field, formatted as startbit+endbit (endbit=LSB, startbit=LSB+Width-1). > I actually encourage any user of the packing library to write the code > in the closest way possible to the field definitions as they appear in > the documentation. What do you think? Our documentation has both lsb, msb, and the width. I agree, and i think it would benefit if the packing documentation suggests this as well. > >>> I came to the realization that this API would be nice exactly because >>> otherwise, you need to duplicate the field definitions, once for the >>> pack() call and once for the unpack(). But if they're tables, you don't. >>> >>> I'm looking at ways in which this new API could solve all the shortcomings >>> I don't like with lib/packing in general. Without being even aware of >>> ICE_CTX_STORE, I had actually implemented the type conversion to smaller >>> unpacked u8/u16/u32 types in exactly the same way. >> >> I think having this in the core API with a standardized table, along >> with support for unpacking the types would be great! > > Cool then! > >>> I also wish to do something about the way in which lib/packing deals >>> with errors. I don't think it's exactly great for every driver to have >>> to dump stack and ignore them. And also, since they're programming >>> errors, it's odd (and bad for performance) to perform the sanity checks >>> for every function call, instead of just once, say at boot time. So I >>> had thought of a third new API call: packed_fields_validate(), which >>> runs at module_init() in the consumer driver (ice, sja1105, ...) and >>> operates on the field tables. >>> >> >> It does seem reasonable to me that we can move the sanity checks here, >> especially since the main programmer error is if this table is incorrect >> in one of these ways. > > Actually I did manage to cook something up which I like even more than > packed_fields_validate(): a compile-time sanity check. I'm not completely > happy with it, just reasonably happy. You'll see.>>> Basically it is a singular replacement for these existing verifications >>> in pack() and unpack(): >>> >>> /* startbit is expected to be larger than endbit, and both are >>> * expected to be within the logically addressable range of the buffer. >>> */ >>> if (unlikely(startbit < endbit || startbit >= 8 * pbuflen)) >>> /* Invalid function call */ >>> return -EINVAL; >>> >>> value_width = startbit - endbit + 1; >>> if (unlikely(value_width > 64)) >>> return -ERANGE; >>> >>> so you actually need to tell packed_fields_validate() what is the size >>> of the buffer you intend to run pack_fields() and unpack_fields() on. >>> I don't see it as a problem at all that the packed buffer size must be >>> fixed and known ahead of time - you also operate on fixed ICE_RXQ_CTX_SZ >>> and ... hmmm... txq_ctx[22]? sized buffers. >>> >> >> Yea, these are fixed sizes. Strictly I think we could have a macro >> defining the size of the Tx queue context as well.... > > Yeah, you'll need than when you'll see what I've prepared below :) > >>> But this packed_fields_validate() idea quickly creates another set of 2 >>> problems which I didn't get to the bottom of: >>> >>> 1. Some sanity checks in pack() are dynamic and cannot be run at >>> module_init() time. Like this: >>> >>> /* Check if "uval" fits in "value_width" bits. >>> * If value_width is 64, the check will fail, but any >>> * 64-bit uval will surely fit. >>> */ >>> if (value_width < 64 && uval >= (1ull << value_width)) >>> /* Cannot store "uval" inside "value_width" bits. >>> * Truncating "uval" is most certainly not desirable, >>> * so simply erroring out is appropriate. >>> */ >>> return -ERANGE; >>> >> >> If we support u8/u16/u32/u64 sizes as well, you could check that the >> size of the unpacked variable too. Could this data be in the table? Oh I >> guess technically not because we are checking if the actual value passed >> fits. I think keeping this but making it WARN would be sufficient? > > The u8/u16/u32/u64 unpacked field size will absolutely be held in the > struct packed_field tables. > >>> The worse part is that the check is actually useful. It led to the >>> quick identification of the bug behind commit 24deec6b9e4a ("net: >>> dsa: sja1105: disallow C45 transactions on the BASE-TX MDIO bus"), >>> rather than seeing a silent failure. But... I would still really like >>> the revised lib/packing API to return void for pack_fields() and >>> unpack_fields(). Not really sure how to reconcile these. >>> >> >> Since this is generally programmer error (not something where uAPI can >> affect it) what about making these WARN in the core? > > Yes, I went for demoting the -ERANGE error in the truncation case to a > WARN() that is printed for both the "int pack()" as well as the new > "void __pack()" call. Practically, it doesn't make a big difference > whether we bail out or do nothing, as long as we loudly complain in > either case. It's the complaint that leads to the bug getting easily > identified and fixed. > Right. >>> 2. How to make sure that the pbuflen passed to packed_fields_validate() >>> will be the same one as the pbuflen passed to all subsequent pack_fields() >>> calls validated against that very pbuflen? >> >> I guess you either duplicate the check and WARN, or you don't, and let >> it panic on the invalid memory? But I guess that would only actually >> panic with something like KASAN > > Yeah, I'm still not too sure here. The length of the packed buffer still > needs to be "obvious to the eye", since the same length must be manually > passed to the sanity check as well as to the actual pack_fields() call, > otherwise "bad things" will happen. I believe in the users' ability to > structure their code in such a way that this is not hard. Especially > since the sanity checks now rely on BUILD_BUG_ON(), and that can be > technically be placed anywhere in the code, I expect the best place to > put it is exactly near the pack_fields() call. That way, it's the most > obvious that the buffer length declared for the sanity check is > identical to the one during actual usage. Especially if the usage can be > restricted to just one function or two. > Makes sense. I may need to do a little work on the ice driver to make that fit well :D I think currently we have a packing for the Tx queue context in different places. >>> Sorry for not posting code and just telling a story about it. There >>> isn't much to show other than unfinished ideas with conflicting >>> requirements. So I thought maybe I could use a little help with some >>> brainstorming. Of course, do let me know if you are not that interested >>> in making the ICE_CTX_STORE tables idea a part of the packing core. >>> >>> Thanks! >> >> I think moving this into core would be fantastic. Since pretty much >> every driver handles these sanity checks the same way, I also think that >> moving those into the core and making them WARN or similar seems >> reasonable, so we can make the pack/unpack as void. >> >> It would be interesting to see a comparison of the resulting module size. > > Yeah, I'm also very much interested in comparisons: text size vs rodata > size vs dynamic memory usage. With the pack_fields() API in the sja1105 > driver, I can shrink an enormous amount of u64 unpacked structure fields > to just u8. I also really like the idea of compile-time sanity checks, > and I'm curious how much that matters in a benchmark or something. > Anyway, I don't have a complete rework at the moment, so there's that. > Yea. Actually my entire initial motivation for starting looking at reworking ice was the weird sizes used in our unpacked context structure... :D >> How much of this do you have code for now? > > Well, I do have code, but it's not yet complete :) > I've updated by https://github.com/vladimiroltean/linux/tree/packing-selftests > branch on top of your patch set. Until HEAD~1, I've tested the sja1105 > driver and it still works. The last patch - the bulk of the conversion > actually - is extremely tedious and is not yet ready (it doesn't even > compile). Yet, with a bit of imagination, it should be enough to provide > an example and hopefully move the discussion forward. Ok sounds good. I'll take a look at this.
diff --git a/drivers/net/ethernet/intel/ice/ice_common.h b/drivers/net/ethernet/intel/ice/ice_common.h index 27208a60cece..550fdd08e6aa 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.h +++ b/drivers/net/ethernet/intel/ice/ice_common.h @@ -5,6 +5,7 @@ #define _ICE_COMMON_H_ #include <linux/bitfield.h> +#include <linux/packing.h> #include "ice.h" #include "ice_type.h" @@ -93,8 +94,16 @@ bool ice_check_sq_alive(struct ice_hw *hw, struct ice_ctl_q_info *cq); int ice_aq_q_shutdown(struct ice_hw *hw, bool unloading); void ice_fill_dflt_direct_cmd_desc(struct ice_aq_desc *desc, u16 opcode); extern const struct ice_ctx_ele ice_tlan_ctx_info[]; -int ice_set_ctx(struct ice_hw *hw, u8 *src_ctx, u8 *dest_ctx, - const struct ice_ctx_ele *ce_info); +extern const struct ice_ctx_ele ice_rlan_ctx_info[]; + +void __ice_pack_queue_ctx(const void *ctx, void *buf, size_t len, + const struct ice_ctx_ele *ctx_info); + +#define ice_pack_rxq_ctx(rlan_ctx, buf) \ + __ice_pack_queue_ctx((rlan_ctx), (buf), sizeof(buf), ice_rlan_ctx_info) + +#define ice_pack_txq_ctx(tlan_ctx, buf) \ + __ice_pack_queue_ctx((tlan_ctx), (buf), sizeof(buf), ice_tlan_ctx_info) extern struct mutex ice_global_cfg_lock_sw; diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c index f448d3a84564..1881ce8105ca 100644 --- a/drivers/net/ethernet/intel/ice/ice_base.c +++ b/drivers/net/ethernet/intel/ice/ice_base.c @@ -907,8 +907,7 @@ ice_vsi_cfg_txq(struct ice_vsi *vsi, struct ice_tx_ring *ring, ice_setup_tx_ctx(ring, &tlan_ctx, pf_q); /* copy context contents into the qg_buf */ qg_buf->txqs[0].txq_id = cpu_to_le16(pf_q); - ice_set_ctx(hw, (u8 *)&tlan_ctx, qg_buf->txqs[0].txq_ctx, - ice_tlan_ctx_info); + ice_pack_txq_ctx(&tlan_ctx, qg_buf->txqs[0].txq_ctx); /* init queue specific tail reg. It is referred as * transmit comm scheduler queue doorbell. diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c index c4b24ba36b5e..09a94c20e16d 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.c +++ b/drivers/net/ethernet/intel/ice/ice_common.c @@ -1387,8 +1387,89 @@ ice_copy_rxq_ctx_to_hw(struct ice_hw *hw, u8 *ice_rxq_ctx, u32 rxq_index) return 0; } +/** + * ice_ctx_pack - Pack Tx/Rx queue context data + * @pbuf: pointer to the packed buffer + * @pbuflen: size of the packed buffer + * @val: pointer to storage for the unpacked data + * @len: size of the unpacked data + * @width: width of bits to pack + * @lsb: least significant bit in the packed buffer + * + * Pack the given field of the Tx or Rx queue context into the hardware data + * buffer. The packed contents are in full Little Endian ordering with the + * least significant 4-byte block first. + * + * The only time that pack() should produce an error is due to invalid bit + * offsets. Thus, errors are logged along with a stack dump. + */ +static void ice_ctx_pack(void *pbuf, size_t pbuflen, const void *val, + size_t len, size_t width, size_t lsb) +{ + size_t msb = lsb + width - 1; + u64 uval; + int err; + + switch (len) { + case sizeof(u8): + uval = *((u8 *)val); + break; + case sizeof(u16): + uval = *((u16 *)val); + break; + case sizeof(u32): + uval = *((u32 *)val); + break; + case sizeof(u64): + uval = *((u64 *)val); + break; + default: + WARN_ONCE(1, "Unexpected size %zd when packing queue context", + len); + return; + } + + err = pack(pbuf, uval, msb, lsb, pbuflen, + QUIRK_LITTLE_ENDIAN | QUIRK_LSW32_IS_FIRST); + if (unlikely(err)) { + if (err == -EINVAL) { + pr_err("MSB (%zd) expected to be larger than LSB (%zd)\n", + msb, lsb); + } else if (err == -ERANGE) { + if ((msb - lsb + 1) > 64) + pr_err("Field %zd-%zd too large for 64 bits!\n", + lsb, msb); + else + pr_err("Cannot store %llx inside field %zd-%zd (would truncate)\n", + uval, lsb, msb); + } + dump_stack(); + } +} + +/** + * __ice_pack_queue_ctx - Pack Tx or Rx queue context + * @ctx: Unpacked queue context structure + * @buf: packed buffer storage + * @len: size of the packed buffer + * @ctx_info: array describing the packing layout + */ +void __ice_pack_queue_ctx(const void *ctx, void *buf, size_t len, + const struct ice_ctx_ele *ctx_info) +{ + for (int f = 0; ctx_info[f].width; f++) { + const struct ice_ctx_ele *field_info = &ctx_info[f]; + const void *field; + + field = (const void *)ctx + field_info->offset; + + ice_ctx_pack(buf, len, field, field_info->size_of, + field_info->width, field_info->lsb); + } +} + /* LAN Rx Queue Context */ -static const struct ice_ctx_ele ice_rlan_ctx_info[] = { +const struct ice_ctx_ele ice_rlan_ctx_info[] = { /* Field Width LSB */ ICE_CTX_STORE(ice_rlan_ctx, head, 13, 0), ICE_CTX_STORE(ice_rlan_ctx, cpuid, 8, 13), @@ -1433,7 +1514,8 @@ int ice_write_rxq_ctx(struct ice_hw *hw, struct ice_rlan_ctx *rlan_ctx, rlan_ctx->prefena = 1; - ice_set_ctx(hw, (u8 *)rlan_ctx, ctx_buf, ice_rlan_ctx_info); + ice_pack_rxq_ctx(rlan_ctx, ctx_buf); + return ice_copy_rxq_ctx_to_hw(hw, ctx_buf, rxq_index); } @@ -4526,205 +4608,6 @@ ice_aq_add_rdma_qsets(struct ice_hw *hw, u8 num_qset_grps, /* End of FW Admin Queue command wrappers */ -/** - * ice_pack_ctx_byte - write a byte to a packed context structure - * @src_ctx: unpacked source context structure - * @dest_ctx: packed destination context data - * @ce_info: context element description - */ -static void ice_pack_ctx_byte(u8 *src_ctx, u8 *dest_ctx, - const struct ice_ctx_ele *ce_info) -{ - u8 src_byte, dest_byte, mask; - u8 *from, *dest; - u16 shift_width; - - /* copy from the next struct field */ - from = src_ctx + ce_info->offset; - - /* prepare the bits and mask */ - shift_width = ce_info->lsb % 8; - mask = GENMASK(ce_info->width - 1 + shift_width, shift_width); - - src_byte = *from; - src_byte <<= shift_width; - src_byte &= mask; - - /* get the current bits from the target bit string */ - dest = dest_ctx + (ce_info->lsb / 8); - - memcpy(&dest_byte, dest, sizeof(dest_byte)); - - dest_byte &= ~mask; /* get the bits not changing */ - dest_byte |= src_byte; /* add in the new bits */ - - /* put it all back */ - memcpy(dest, &dest_byte, sizeof(dest_byte)); -} - -/** - * ice_pack_ctx_word - write a word to a packed context structure - * @src_ctx: unpacked source context structure - * @dest_ctx: packed destination context data - * @ce_info: context element description - */ -static void ice_pack_ctx_word(u8 *src_ctx, u8 *dest_ctx, - const struct ice_ctx_ele *ce_info) -{ - u16 src_word, mask; - __le16 dest_word; - u8 *from, *dest; - u16 shift_width; - - /* copy from the next struct field */ - from = src_ctx + ce_info->offset; - - /* prepare the bits and mask */ - shift_width = ce_info->lsb % 8; - mask = GENMASK(ce_info->width - 1 + shift_width, shift_width); - - /* don't swizzle the bits until after the mask because the mask bits - * will be in a different bit position on big endian machines - */ - src_word = *(u16 *)from; - src_word <<= shift_width; - src_word &= mask; - - /* get the current bits from the target bit string */ - dest = dest_ctx + (ce_info->lsb / 8); - - memcpy(&dest_word, dest, sizeof(dest_word)); - - dest_word &= ~(cpu_to_le16(mask)); /* get the bits not changing */ - dest_word |= cpu_to_le16(src_word); /* add in the new bits */ - - /* put it all back */ - memcpy(dest, &dest_word, sizeof(dest_word)); -} - -/** - * ice_pack_ctx_dword - write a dword to a packed context structure - * @src_ctx: unpacked source context structure - * @dest_ctx: packed destination context data - * @ce_info: context element description - */ -static void ice_pack_ctx_dword(u8 *src_ctx, u8 *dest_ctx, - const struct ice_ctx_ele *ce_info) -{ - u32 src_dword, mask; - __le32 dest_dword; - u8 *from, *dest; - u16 shift_width; - - /* copy from the next struct field */ - from = src_ctx + ce_info->offset; - - /* prepare the bits and mask */ - shift_width = ce_info->lsb % 8; - mask = GENMASK(ce_info->width - 1 + shift_width, shift_width); - - /* don't swizzle the bits until after the mask because the mask bits - * will be in a different bit position on big endian machines - */ - src_dword = *(u32 *)from; - src_dword <<= shift_width; - src_dword &= mask; - - /* get the current bits from the target bit string */ - dest = dest_ctx + (ce_info->lsb / 8); - - memcpy(&dest_dword, dest, sizeof(dest_dword)); - - dest_dword &= ~(cpu_to_le32(mask)); /* get the bits not changing */ - dest_dword |= cpu_to_le32(src_dword); /* add in the new bits */ - - /* put it all back */ - memcpy(dest, &dest_dword, sizeof(dest_dword)); -} - -/** - * ice_pack_ctx_qword - write a qword to a packed context structure - * @src_ctx: unpacked source context structure - * @dest_ctx: packed destination context data - * @ce_info: context element description - */ -static void ice_pack_ctx_qword(u8 *src_ctx, u8 *dest_ctx, - const struct ice_ctx_ele *ce_info) -{ - u64 src_qword, mask; - __le64 dest_qword; - u8 *from, *dest; - u16 shift_width; - - /* copy from the next struct field */ - from = src_ctx + ce_info->offset; - - /* prepare the bits and mask */ - shift_width = ce_info->lsb % 8; - mask = GENMASK_ULL(ce_info->width - 1 + shift_width, shift_width); - - /* don't swizzle the bits until after the mask because the mask bits - * will be in a different bit position on big endian machines - */ - src_qword = *(u64 *)from; - src_qword <<= shift_width; - src_qword &= mask; - - /* get the current bits from the target bit string */ - dest = dest_ctx + (ce_info->lsb / 8); - - memcpy(&dest_qword, dest, sizeof(dest_qword)); - - dest_qword &= ~(cpu_to_le64(mask)); /* get the bits not changing */ - dest_qword |= cpu_to_le64(src_qword); /* add in the new bits */ - - /* put it all back */ - memcpy(dest, &dest_qword, sizeof(dest_qword)); -} - -/** - * ice_set_ctx - set context bits in packed structure - * @hw: pointer to the hardware structure - * @src_ctx: pointer to a generic non-packed context structure - * @dest_ctx: pointer to memory for the packed structure - * @ce_info: List of Rx context elements - */ -int ice_set_ctx(struct ice_hw *hw, u8 *src_ctx, u8 *dest_ctx, - const struct ice_ctx_ele *ce_info) -{ - int f; - - for (f = 0; ce_info[f].width; f++) { - /* We have to deal with each element of the FW response - * using the correct size so that we are correct regardless - * of the endianness of the machine. - */ - if (ce_info[f].width > (ce_info[f].size_of * BITS_PER_BYTE)) { - ice_debug(hw, ICE_DBG_QCTX, "Field %d width of %d bits larger than size of %d byte(s) ... skipping write\n", - f, ce_info[f].width, ce_info[f].size_of); - continue; - } - switch (ce_info[f].size_of) { - case sizeof(u8): - ice_pack_ctx_byte(src_ctx, dest_ctx, &ce_info[f]); - break; - case sizeof(u16): - ice_pack_ctx_word(src_ctx, dest_ctx, &ce_info[f]); - break; - case sizeof(u32): - ice_pack_ctx_dword(src_ctx, dest_ctx, &ce_info[f]); - break; - case sizeof(u64): - ice_pack_ctx_qword(src_ctx, dest_ctx, &ce_info[f]); - break; - default: - return -EINVAL; - } - } - - return 0; -} - /** * ice_get_lan_q_ctx - get the LAN queue context for the given VSI and TC * @hw: pointer to the HW struct