Message ID | 20230731121247.3972783-1-linma@zju.edu.cn (mailing list archive) |
---|---|
State | Rejected |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,v1,1/2] netlink: let len field used to parse type-not-care nested attrs | expand |
On Mon, 31 Jul 2023 20:12:47 +0800 Lin Ma wrote: > In short, the very direct idea to fix such lengh-check-forgotten bug is > add nla_len() checks like > > if (nla_len(nla) < SOME_LEN) > return -EINVAL; > > However, this is tedious and just like Leon said: add another layer of > cabal knowledge. The better solution should leverage the nla_policy and > discard nlattr whose length is invalid when doing parsing. That is, we > should defined a nested_policy for the X above like Hard no. Putting array index into attr type is an advanced case and the parsing code has to be able to deal with low level netlink details. Higher level API should remove the nla_for_each_nested() completely which is rather hard to achieve here. Nacked-by: Jakub Kicinski <kuba@kernel.org>
Hello Jakub, > > > > However, this is tedious and just like Leon said: add another layer of > > cabal knowledge. The better solution should leverage the nla_policy and > > discard nlattr whose length is invalid when doing parsing. That is, we > > should defined a nested_policy for the X above like > > Hard no. Putting array index into attr type is an advanced case and the > parsing code has to be able to deal with low level netlink details. Well, I just known that the type field for those attributes is used as array index. Hence, for this advanced case, could we define another NLA type, maybe NLA_NESTED_IDXARRAY enum? That may be much clearer against modifying existing code. > Higher level API should remove the nla_for_each_nested() completely > which is rather hard to achieve here. By investigating the code uses nla_for_each_nested macro. There are basically two scenarios: 1. manually parse nested attributes whose type is not cared (the advance case use type as index here). 2. manually parse nested attributes for *one* specific type. Such code do nla_type check. From the API side, to completely remove nla_for_each_nested and avoid the manual parsing. I think we can choose two solutions. Solution-1: add a parsing helper that receives a function pointer as an argument, it will call this pointer after carefully verify the type and length of an attribute. Solution-2: add a parsing helper that traverses this nested twice, the first time to do counting size for allocating heap buffer (or stack buffer from the caller if the max count is known). The second time to fill this buffer with attribute pointers. Which one is preferred? Please enlighten me about this and I can try to propose a fix. (I personally like the solution-2 as it works like the existing parsers like nla_parse) > > Nacked-by: Jakub Kicinski <kuba@kernel.org> Thanks Lin
On Tue, 1 Aug 2023 10:00:01 +0800 (GMT+08:00) Lin Ma wrote: > > > However, this is tedious and just like Leon said: add another layer of > > > cabal knowledge. The better solution should leverage the nla_policy and > > > discard nlattr whose length is invalid when doing parsing. That is, we > > > should defined a nested_policy for the X above like > > > > Hard no. Putting array index into attr type is an advanced case and the > > parsing code has to be able to deal with low level netlink details. > > Well, I just known that the type field for those attributes is used as array > index. > Hence, for this advanced case, could we define another NLA type, maybe > NLA_NESTED_IDXARRAY enum? That may be much clearer against modifying existing > code. What if the value is of a complex type (nest)? For 10th time, if someone does "interesting things" they must know what they're doing. > > Higher level API should remove the nla_for_each_nested() completely > > which is rather hard to achieve here. > > By investigating the code uses nla_for_each_nested macro. There are basically > two scenarios: > > 1. manually parse nested attributes whose type is not cared (the advance case > use type as index here). > 2. manually parse nested attributes for *one* specific type. Such code do > nla_type check. > > From the API side, to completely remove nla_for_each_nested and avoid the > manual parsing. I think we can choose two solutions. > > Solution-1: add a parsing helper that receives a function pointer as an > argument, it will call this pointer after carefully verify the > type and length of an attribute. > > Solution-2: add a parsing helper that traverses this nested twice, the first > time to do counting size for allocating heap buffer (or stack > buffer from the caller if the max count is known). The second > time to fill this buffer with attribute pointers. > > Which one is preferred? Please enlighten me about this and I can try to propose > a fix. (I personally like the solution-2 as it works like the existing parsers > like nla_parse) Your initial fix was perfectly fine. We do need a solution for a normal multi-attr parse, but that's not the case here.
Hello Jakub, > > Your initial fix was perfectly fine. > > We do need a solution for a normal multi-attr parse, but that's not > the case here. Cool
On Mon, Jul 31, 2023 at 12:03:26PM -0700, Jakub Kicinski wrote: > On Mon, 31 Jul 2023 20:12:47 +0800 Lin Ma wrote: > > In short, the very direct idea to fix such lengh-check-forgotten bug is > > add nla_len() checks like > > > > if (nla_len(nla) < SOME_LEN) > > return -EINVAL; > > > > However, this is tedious and just like Leon said: add another layer of > > cabal knowledge. The better solution should leverage the nla_policy and > > discard nlattr whose length is invalid when doing parsing. That is, we > > should defined a nested_policy for the X above like > > Hard no. Putting array index into attr type is an advanced case and the > parsing code has to be able to deal with low level netlink details. Jakub, IMHO, you are lowering too much the separation line between simple vs. advanced use cases. I had no idea that my use-case of passing nested netlink array is counted as advanced usage. Thanks
On Tue, 1 Aug 2023 11:11:17 +0300 Leon Romanovsky wrote: > IMHO, you are lowering too much the separation line between simple vs. > advanced use cases. > > I had no idea that my use-case of passing nested netlink array is counted > as advanced usage. Agreed, that's a fair point. I'm guessing it was inspired by the ethtool stats? (Which in hindsight were a mistake on my part.) For the longest time there was no docs or best practices for netlink. We have the documentation and more infrastructure in place now. I hope if you wrote the code today the distinction would have been clearer. If we start adding APIs for various one-(two?)-offs from the past we'll never dig ourselves out of the "no idea what's the normal use of these APIs" hole..
On Tue, Aug 01, 2023 at 10:57:26AM -0700, Jakub Kicinski wrote: > On Tue, 1 Aug 2023 11:11:17 +0300 Leon Romanovsky wrote: > > IMHO, you are lowering too much the separation line between simple vs. > > advanced use cases. > > > > I had no idea that my use-case of passing nested netlink array is counted > > as advanced usage. > > Agreed, that's a fair point. I'm guessing it was inspired by the > ethtool stats? (Which in hindsight were a mistake on my part.) I don't remember which part of kernel can be blamed for it. :) > > For the longest time there was no docs or best practices for netlink. > We have the documentation and more infrastructure in place now. > I hope if you wrote the code today the distinction would have been > clearer. > > If we start adding APIs for various one-(two?)-offs from the past > we'll never dig ourselves out of the "no idea what's the normal use > of these APIs" hole.. I agree with this sentence, just afraid that it is unrealistic goal, due to extensive flexibility in netlink UAPI toward user-space, which allows you to shoot yourself in the foot without even noticing it. Thanks
Hello there, > For the longest time there was no docs or best practices for netlink. > We have the documentation and more infrastructure in place now. > I hope if you wrote the code today the distinction would have been > clearer. > > If we start adding APIs for various one-(two?)-offs from the past > we'll never dig ourselves out of the "no idea what's the normal use > of these APIs" hole.. This is true. Actually, those check missing codes are mostly old codes and modern netlink consumers will choose the general netlink interface which can automatically get attributes description from YAML and never need to do things like *manual parsing* anymore. However, according to my practice in auditing the code, I found there are some general netlink interface users confront other issues like choosing GENL_DONT_VALIDATE_STRICT without thinking or forgetting add a new nla_policy when introducing new attributes. To this end, I'm currently writing a simple documentation about Netlink interface best practices for the kernel developer (the newly coming docs are mostly about the user API part). I guess I will put this doc in Documentation/staging :) and I will finish the draft ASAP. Thanks Lin
On Wed, 2 Aug 2023 08:26:06 +0800 (GMT+08:00) Lin Ma wrote: > This is true. Actually, those check missing codes are mostly old codes and > modern netlink consumers will choose the general netlink interface which > can automatically get attributes description from YAML and never need to > do things like *manual parsing* anymore. > > However, according to my practice in auditing the code, I found there are > some general netlink interface users confront other issues like choosing > GENL_DONT_VALIDATE_STRICT without thinking or forgetting add a new > nla_policy when introducing new attributes. > > To this end, I'm currently writing a simple documentation about Netlink > interface best practices for the kernel developer (the newly coming docs > are mostly about the user API part). Keep in mind that even most of the genetlink stuff is pretty old at this stage. ethtool is probably the first reasonably modern family. But do send docs, we'll review and go from there :)
diff --git a/include/net/netlink.h b/include/net/netlink.h index b12cd957abb4..d825a5672161 100644 --- a/include/net/netlink.h +++ b/include/net/netlink.h @@ -229,6 +229,9 @@ enum nla_policy_validation { * nested header (or empty); len field is used if * nested_policy is also used, for the max attr * number in the nested policy. + * For NLA_NESTED whose nested nlattr is not necessary, + * the len field will indicate the exptected length of + * them for checking. * NLA_U8, NLA_U16, * NLA_U32, NLA_U64, * NLA_S8, NLA_S16, @@ -372,6 +375,9 @@ struct nla_policy { _NLA_POLICY_NESTED(ARRAY_SIZE(policy) - 1, policy) #define NLA_POLICY_NESTED_ARRAY(policy) \ _NLA_POLICY_NESTED_ARRAY(ARRAY_SIZE(policy) - 1, policy) +/* not care about the nested attributes, just do length check */ +#define NLA_POLICY_NESTED_NO_TYPE(length) \ + _NLA_POLICY_NESTED(length, NULL) #define NLA_POLICY_BITFIELD32(valid) \ { .type = NLA_BITFIELD32, .bitfield32_valid = valid } diff --git a/lib/nlattr.c b/lib/nlattr.c index 489e15bde5c1..29a412b41d28 100644 --- a/lib/nlattr.c +++ b/lib/nlattr.c @@ -488,6 +488,18 @@ static int validate_nla(const struct nlattr *nla, int maxtype, */ return err; } + } else if (pt->len) { + /* length set without nested_policy, the len field will + * be used to check those nested attributes here, + * we will not do parse here but just validation as the + * consumers will do manual parsing. + */ + const struct nlattr *nla_nested; + int rem; + + nla_for_each_attr(nla_nested, nla_data(nla), nla_len(nla), rem) + if (nla_len(nla_nested) < pt->len) + return -EINVAL; } break; case NLA_NESTED_ARRAY:
Recently I found several manual parsing cases for nested attributes whose fix is rather trivial. The pattern for those like below const struct nla_policy y[...] = { ... X = { .type = NLA_NESTED }, ... } nla_for_each_nested/attr(nla, tb[X], ...) { // nla_type never used ... x = nla_data(nla) // directly access nla without length checking .... } One example can be found in discussion at: https://lore.kernel.org/all/20230723074504.3706691-1-linma@zju.edu.cn/ In short, the very direct idea to fix such lengh-check-forgotten bug is add nla_len() checks like if (nla_len(nla) < SOME_LEN) return -EINVAL; However, this is tedious and just like Leon said: add another layer of cabal knowledge. The better solution should leverage the nla_policy and discard nlattr whose length is invalid when doing parsing. That is, we should defined a nested_policy for the X above like X = { NLA_POLICY_NESTED(Z) }, But unfortunately, as said above, the nla_type is never used in such manual parsing cases, which means is difficult to defined a nested policy Z without breaking user space (they may put weird value in type of these nlattrs, we have no idea). To this end, this commit uses the len field in nla_policy crafty and allow the existing validate_nla checks such type-not-care nested attrs. In current implementation, for attribute with type NLA_NESTED, the len field used as the length of the nested_policy: { .type = NLA_NESTED, .nested_policy = policy, .len = maxattr } _NLA_POLICY_NESTED(ARRAY_SIZE(policy) - 1, policy) If one nlattr does not provide policy, like the example X, this len field is not used. This means we can leverage this field for our end. This commit introduces one new macro named NLA_POLICY_NESTED_NO_TYPE and let validate_nla() to use the len field as a hint to check the nested attributes. Therefore, such manual parsing code can also define a nla_policy and take advantage of the validation within the existing parsers like nla_parse_deprecated.. Signed-off-by: Lin Ma <linma@zju.edu.cn> --- include/net/netlink.h | 6 ++++++ lib/nlattr.c | 11 +++++++++++ 2 files changed, 17 insertions(+)