diff mbox

[net,1/2] netlink: add NLA_U8_BUGGY attribute type

Message ID 20171202202332.10205-1-johannes@sipsolutions.net (mailing list archive)
State Not Applicable
Delegated to: Johannes Berg
Headers show

Commit Message

Johannes Berg Dec. 2, 2017, 8:23 p.m. UTC
From: Johannes Berg <johannes.berg@intel.com>

This netlink type is used only for backwards compatibility
with broken userspace that used the wrong size for a given
u8 attribute, which is now rejected. It would've been wrong
before already, since on big endian the wrong value (always
zero) would be used by the kernel, but we can't break the
existing deployed userspace - hostapd for example now fails
to initialize entirely.

We could try to fix up the big endian problem here, but we
don't know *how* userspace misbehaved - if using nla_put_u32
then we could, but we also found a debug tool (which we'll
ignore for the purposes of this regression) that was putting
the padding into the length.

Fixes: 28033ae4e0f5 ("net: netlink: Update attr validation to require exact length for some types")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
 include/net/netlink.h | 1 +
 lib/nlattr.c          | 1 +
 2 files changed, 2 insertions(+)

Comments

David Ahern Dec. 2, 2017, 10:48 p.m. UTC | #1
On 12/2/17 1:23 PM, Johannes Berg wrote:
> From: Johannes Berg <johannes.berg@intel.com>
> 
> This netlink type is used only for backwards compatibility
> with broken userspace that used the wrong size for a given
> u8 attribute, which is now rejected. It would've been wrong
> before already, since on big endian the wrong value (always
> zero) would be used by the kernel, but we can't break the
> existing deployed userspace - hostapd for example now fails
> to initialize entirely.
> 
> We could try to fix up the big endian problem here, but we
> don't know *how* userspace misbehaved - if using nla_put_u32
> then we could, but we also found a debug tool (which we'll
> ignore for the purposes of this regression) that was putting
> the padding into the length.
> 
> Fixes: 28033ae4e0f5 ("net: netlink: Update attr validation to require exact length for some types")
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>

Hi Johannes:

I have been really busy the past 2 weeks, so have not gotten around to
dealing with this. I was planning to partially revert 28033ae4e0f5 --
change it from failure to log an error message so buggy commands can be
fixed.

David
David Miller Dec. 5, 2017, 4:31 p.m. UTC | #2
From: Johannes Berg <johannes@sipsolutions.net>
Date: Sat,  2 Dec 2017 21:23:31 +0100

> From: Johannes Berg <johannes.berg@intel.com>
> 
> This netlink type is used only for backwards compatibility
> with broken userspace that used the wrong size for a given
> u8 attribute, which is now rejected. It would've been wrong
> before already, since on big endian the wrong value (always
> zero) would be used by the kernel, but we can't break the
> existing deployed userspace - hostapd for example now fails
> to initialize entirely.
> 
> We could try to fix up the big endian problem here, but we
> don't know *how* userspace misbehaved - if using nla_put_u32
> then we could, but we also found a debug tool (which we'll
> ignore for the purposes of this regression) that was putting
> the padding into the length.
> 
> Fixes: 28033ae4e0f5 ("net: netlink: Update attr validation to require exact length for some types")
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>

We're stuck with this thing forever... I'd like to consider other
options.

I've seen this problem at least one time before, therefore I
suggest when we see a U8 attribute with a U32's length:

1) We access it as a u32, this takes care of all endianness
   issues.

2) We emit a warning so that the app gets fixes.

Thanks.
Johannes Berg Dec. 5, 2017, 4:34 p.m. UTC | #3
On Tue, 2017-12-05 at 11:31 -0500, David Miller wrote:
> 
> > We could try to fix up the big endian problem here, but we
> > don't know *how* userspace misbehaved - if using nla_put_u32
> > then we could, but we also found a debug tool (which we'll
> > ignore for the purposes of this regression) that was putting
> > the padding into the length.

> We're stuck with this thing forever... I'd like to consider other
> options.
> 
> I've seen this problem at least one time before, therefore I
> suggest when we see a U8 attribute with a U32's length:
> 
> 1) We access it as a u32, this takes care of all endianness
>    issues.

Possible, but as I said above, I've seen at least one tool (a debug
only script) now that will actually emit a U8 followed by 3 bytes of
padding to make it netlink-aligned, but set the length to 4. That would
be broken by making this change.

I'm not saying this is bad - but there are different levels of
compatibility and I'd probably go for "bug compatibility" here rather
than "fix-it-up compatibility".

Your call, ultimately - I've already fixed the tool I had found :-)

> 2) We emit a warning so that the app gets fixes.

For sure.

johannes
David Miller Dec. 5, 2017, 4:41 p.m. UTC | #4
From: Johannes Berg <johannes@sipsolutions.net>
Date: Tue, 05 Dec 2017 17:34:21 +0100

> On Tue, 2017-12-05 at 11:31 -0500, David Miller wrote:
>> 
>> > We could try to fix up the big endian problem here, but we
>> > don't know *how* userspace misbehaved - if using nla_put_u32
>> > then we could, but we also found a debug tool (which we'll
>> > ignore for the purposes of this regression) that was putting
>> > the padding into the length.
> 
>> We're stuck with this thing forever... I'd like to consider other
>> options.
>> 
>> I've seen this problem at least one time before, therefore I
>> suggest when we see a U8 attribute with a U32's length:
>> 
>> 1) We access it as a u32, this takes care of all endianness
>>    issues.
> 
> Possible, but as I said above, I've seen at least one tool (a debug
> only script) now that will actually emit a U8 followed by 3 bytes of
> padding to make it netlink-aligned, but set the length to 4. That would
> be broken by making this change.

There is no reasonable interpretation for what that application is
doing, so I think we can safely call that case as buggy.

We are only trying to handle the situation where a U8 attribute
is presented as a bonafide U32 or a correct U8.

Does this make sense?
Johannes Berg Dec. 5, 2017, 5:30 p.m. UTC | #5
On Tue, 2017-12-05 at 11:41 -0500, David Miller wrote:
> 
> There is no reasonable interpretation for what that application is
> doing, so I think we can safely call that case as buggy.
> 
> We are only trying to handle the situation where a U8 attribute
> is presented as a bonafide U32 or a correct U8.
> 
> Does this make sense?

Well the application is buggy, but we don't really know in what way?
Perhaps somebody even did the equivalent of
    nla_put_u32(ATTR, cpu_to_le32(x))
when they noticed it was broken on BE, and end up with a similar case
as I had above.

I don't think there's a good solution to this, applications must be
fixed anyhow. I'm just saying that I'd save the extra code and stay
compatible with applications as written today, even if they're now
broken on BE - and rely on the warning to fix it. Trying to fix it up
seems to have the potential to just break something else.

johannes
David Miller Dec. 5, 2017, 5:40 p.m. UTC | #6
From: Johannes Berg <johannes@sipsolutions.net>
Date: Tue, 05 Dec 2017 18:30:10 +0100

> On Tue, 2017-12-05 at 11:41 -0500, David Miller wrote:
>> 
>> There is no reasonable interpretation for what that application is
>> doing, so I think we can safely call that case as buggy.
>> 
>> We are only trying to handle the situation where a U8 attribute
>> is presented as a bonafide U32 or a correct U8.
>> 
>> Does this make sense?
> 
> Well the application is buggy, but we don't really know in what way?
> Perhaps somebody even did the equivalent of
>     nla_put_u32(ATTR, cpu_to_le32(x))
> when they noticed it was broken on BE, and end up with a similar case
> as I had above.
> 
> I don't think there's a good solution to this, applications must be
> fixed anyhow. I'm just saying that I'd save the extra code and stay
> compatible with applications as written today, even if they're now
> broken on BE - and rely on the warning to fix it. Trying to fix it up
> seems to have the potential to just break something else.

You might be right.

Ok let's just go with the warning + existing behavior for now.
David Ahern Dec. 5, 2017, 6:15 p.m. UTC | #7
On 12/5/17 10:40 AM, David Miller wrote:
> From: Johannes Berg <johannes@sipsolutions.net>
> Date: Tue, 05 Dec 2017 18:30:10 +0100
> 
>> On Tue, 2017-12-05 at 11:41 -0500, David Miller wrote:
>>>
>>> There is no reasonable interpretation for what that application is
>>> doing, so I think we can safely call that case as buggy.
>>>
>>> We are only trying to handle the situation where a U8 attribute
>>> is presented as a bonafide U32 or a correct U8.
>>>
>>> Does this make sense?
>>
>> Well the application is buggy, but we don't really know in what way?
>> Perhaps somebody even did the equivalent of
>>     nla_put_u32(ATTR, cpu_to_le32(x))
>> when they noticed it was broken on BE, and end up with a similar case
>> as I had above.
>>
>> I don't think there's a good solution to this, applications must be
>> fixed anyhow. I'm just saying that I'd save the extra code and stay
>> compatible with applications as written today, even if they're now
>> broken on BE - and rely on the warning to fix it. Trying to fix it up
>> seems to have the potential to just break something else.

+1

> 
> You might be right.
> 
> Ok let's just go with the warning + existing behavior for now.

Is the patch I sent as an attachment good or should I re-send
standalone? (don't see it in patchwork)
David Miller Dec. 5, 2017, 7:08 p.m. UTC | #8
From: David Ahern <dsahern@gmail.com>
Date: Tue, 5 Dec 2017 11:15:29 -0700

> Is the patch I sent as an attachment good or should I re-send
> standalone? (don't see it in patchwork)

Patchwork has been wonky laterly, please resubmit as a fresh
email for rewiew.

Thanks.
diff mbox

Patch

diff --git a/include/net/netlink.h b/include/net/netlink.h
index 0c154f98e987..448a9b86c959 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -180,6 +180,7 @@  enum {
 	NLA_S32,
 	NLA_S64,
 	NLA_BITFIELD32,
+	NLA_U8_BUGGY, /* don't use this - only for bug-ward compatibility */
 	__NLA_TYPE_MAX,
 };
 
diff --git a/lib/nlattr.c b/lib/nlattr.c
index 8bf78b4b78f0..2b89d25d4745 100644
--- a/lib/nlattr.c
+++ b/lib/nlattr.c
@@ -28,6 +28,7 @@  static const u8 nla_attr_len[NLA_TYPE_MAX+1] = {
 };
 
 static const u8 nla_attr_minlen[NLA_TYPE_MAX+1] = {
+	[NLA_U8_BUGGY]	= sizeof(u8),
 	[NLA_MSECS]	= sizeof(u64),
 	[NLA_NESTED]	= NLA_HDRLEN,
 };