Message ID | 20240416152913.1527166-3-omosnace@redhat.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | cipso: make cipso_v4_skbuff_delattr() fully remove the CIPSO options | expand |
On Apr 16, 2024 Ondrej Mosnacek <omosnace@redhat.com> wrote: > > As the comment in this function says, the code currently just clears the > CIPSO part with IPOPT_NOP, rather than removing it completely and > trimming the packet. This is inconsistent with the other > cipso_v4_*_delattr() functions and with CALIPSO (IPv6). This sentence above implies an equality in handling between those three cases that doesn't exist. IPv6 has a radically different approach to IP options, comparisions between the two aren't really valid. Similarly, how we manage IPv4 options on sockets (req or otherwise) is pretty different from how we manage them on packets, and that is intentional. Drop the above sentence, or provide a more detailed explanation of the three aproaches explaining when they can be compared and when they shouldn't be compared. > Implement the proper option removal to make it consistent and producing > more optimal IP packets when there are CIPSO options set. > > Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> > --- > net/ipv4/cipso_ipv4.c | 89 ++++++++++++++++++++++++++++--------------- > 1 file changed, 59 insertions(+), 30 deletions(-) Outside of the SELinux test suite, what testing have you done when you have a Linux box forwarding between a CIPSO network segment and an unlabeled segment? I'm specifically interested in stream based protocols such as TCP. Also, do the rest of the netfilter callbacks handle it okay if the skb changes size in one of the callbacks? Granted it has been *years* since this code was written (decades?), but if I recall correctly, at the time it was a big no-no to change the skb size in a netfilter callback. > diff --git a/net/ipv4/cipso_ipv4.c b/net/ipv4/cipso_ipv4.c > index 75b5e3c35f9bf..c08c6d0262ba8 100644 > --- a/net/ipv4/cipso_ipv4.c > +++ b/net/ipv4/cipso_ipv4.c > @@ -1810,6 +1810,34 @@ static int cipso_v4_genopt(unsigned char *buf, u32 buf_len, > return CIPSO_V4_HDR_LEN + ret_val; > } > > +static int cipso_v4_get_actual_opt_len(const unsigned char *data, int len) > +{ > + int iter = 0, optlen = 0; > + > + /* determining the new total option length is tricky because of > + * the padding necessary, the only thing i can think to do at > + * this point is walk the options one-by-one, skipping the > + * padding at the end to determine the actual option size and > + * from there we can determine the new total option length > + */ > + while (iter < len) { > + if (data[iter] == IPOPT_END) { > + break; > + } else if (data[iter] == IPOPT_NOP) { > + iter++; > + } else { > + if (WARN_ON(data[iter + 1] < 2)) > + iter += 2; > + else > + iter += data[iter + 1]; > + optlen = iter; > + } > + } > + if (WARN_ON(optlen > len)) > + optlen = len; > + return optlen; > +} See my comments in patch 1/2; they apply here. > /** > * cipso_v4_sock_setattr - Add a CIPSO option to a socket > * @sk: the socket > @@ -1985,7 +2013,6 @@ static int cipso_v4_delopt(struct ip_options_rcu __rcu **opt_ptr) > u8 cipso_len; > u8 cipso_off; > unsigned char *cipso_ptr; > - int iter; > int optlen_new; > > cipso_off = opt->opt.cipso - sizeof(struct iphdr); > @@ -2005,28 +2032,8 @@ static int cipso_v4_delopt(struct ip_options_rcu __rcu **opt_ptr) > memmove(cipso_ptr, cipso_ptr + cipso_len, > opt->opt.optlen - cipso_off - cipso_len); > > - /* determining the new total option length is tricky because of > - * the padding necessary, the only thing i can think to do at > - * this point is walk the options one-by-one, skipping the > - * padding at the end to determine the actual option size and > - * from there we can determine the new total option length */ > - iter = 0; > - optlen_new = 0; > - while (iter < opt->opt.optlen) { > - if (opt->opt.__data[iter] == IPOPT_END) { > - break; > - } else if (opt->opt.__data[iter] == IPOPT_NOP) { > - iter++; > - } else { > - if (WARN_ON(opt->opt.__data[iter + 1] < 2)) > - iter += 2; > - else > - iter += opt->opt.__data[iter + 1]; > - optlen_new = iter; > - } > - } > - if (WARN_ON(optlen_new > opt->opt.optlen)) > - optlen_new = opt->opt.optlen; > + optlen_new = cipso_v4_get_actual_opt_len(opt->opt.__data, > + opt->opt.optlen); > hdr_delta = opt->opt.optlen; > opt->opt.optlen = (optlen_new + 3) & ~3; > hdr_delta -= opt->opt.optlen; > @@ -2246,7 +2253,8 @@ int cipso_v4_skbuff_setattr(struct sk_buff *skb, > */ > int cipso_v4_skbuff_delattr(struct sk_buff *skb) > { > - int ret_val; > + int ret_val, cipso_len, hdr_len_actual, new_hdr_len_actual, new_hdr_len, > + hdr_len_delta; Please keep line lengths under 80-chars whenever possible. I know Linus relaxed that requirement a while ago, but I still find the 80-char limit to be a positive thing. > struct iphdr *iph; > struct ip_options *opt = &IPCB(skb)->opt; > unsigned char *cipso_ptr; > @@ -2259,16 +2267,37 @@ int cipso_v4_skbuff_delattr(struct sk_buff *skb) > if (ret_val < 0) > return ret_val; > > - /* the easiest thing to do is just replace the cipso option with noop > - * options since we don't change the size of the packet, although we > - * still need to recalculate the checksum */ Unless you can guarantee that the length change isn't going to have any adverse effects (even then I would want to know why you are so confident), I'd feel a lot more comfortable sticking with a preserve-the-size-and-fill approach. If you want to change that from _NOP to _END, I'd be okay with that. (and if you are talking to who I think you are talking to, I'm guessing the _NOP to _END swap would likely solve their problem) > iph = ip_hdr(skb); > cipso_ptr = (unsigned char *)iph + opt->cipso; > - memset(cipso_ptr, IPOPT_NOOP, cipso_ptr[1]); > + cipso_len = cipso_ptr[1]; > + > + hdr_len_actual = sizeof(struct iphdr) + > + cipso_v4_get_actual_opt_len((unsigned char *)(iph + 1), > + opt->optlen); > + new_hdr_len_actual = hdr_len_actual - cipso_len; > + new_hdr_len = (new_hdr_len_actual + 3) & ~3; > + hdr_len_delta = (iph->ihl << 2) - new_hdr_len; > + > + /* 1. shift any options after CIPSO to the left */ > + memmove(cipso_ptr, cipso_ptr + cipso_len, > + new_hdr_len_actual - opt->cipso); > + /* 2. move the whole IP header to its new place */ > + memmove((unsigned char *)iph + hdr_len_delta, iph, new_hdr_len_actual); > + /* 3. adjust the skb layout */ > + skb_pull(skb, hdr_len_delta); > + skb_reset_network_header(skb); > + iph = ip_hdr(skb); > + /* 4. re-fill new padding with IPOPT_END (may now be longer) */ > + memset((unsigned char *)iph + new_hdr_len_actual, IPOPT_END, > + new_hdr_len - new_hdr_len_actual); > + opt->optlen -= hdr_len_delta; > opt->cipso = 0; > opt->is_changed = 1; > - > + if (hdr_len_delta != 0) { > + iph->ihl = new_hdr_len >> 2; > + iph_set_totlen(iph, skb->len); > + } > ip_send_check(iph); > > return 0; > -- > 2.44.0 -- paul-moore.com
On Tue, Apr 16, 2024 at 8:39 PM Paul Moore <paul@paul-moore.com> wrote: > > On Apr 16, 2024 Ondrej Mosnacek <omosnace@redhat.com> wrote: > > > > As the comment in this function says, the code currently just clears the > > CIPSO part with IPOPT_NOP, rather than removing it completely and > > trimming the packet. This is inconsistent with the other > > cipso_v4_*_delattr() functions and with CALIPSO (IPv6). > > This sentence above implies an equality in handling between those three > cases that doesn't exist. IPv6 has a radically different approach to > IP options, comparisions between the two aren't really valid. I don't think it's that radically different. Look at calipso_skbuff_delattr() and you will see that it already does something very similar as cipso_v4_skbuff_delattr() after this patch. If the CALIPSO options are the only thing in the hopopt header, it removes it altogether. If there are other options, it removes the part with the CALIPSO options - here it is less pedantic and replaces it with up to 7-byte padding if the length of the CALIPSO part is not 8-byte aligned, presumably to avoid having to move the subsequent options (if any) and adjust the padding at the end. > Similarly, > how we manage IPv4 options on sockets (req or otherwise) is pretty > different from how we manage them on packets, and that is intentional. Perhaps, but that is an implementation detail to the user. The resulting packet content should ideally be the same regardless of which method of injecting the options is used, when possible. > Drop the above sentence, or provide a more detailed explanation of the > three aproaches explaining when they can be compared and when they > shouldn't be compared. I'm not sure why you think they shouldn't be compared, but I can surely be more detailed in making my case there. > > Implement the proper option removal to make it consistent and producing > > more optimal IP packets when there are CIPSO options set. > > > > Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> > > --- > > net/ipv4/cipso_ipv4.c | 89 ++++++++++++++++++++++++++++--------------- > > 1 file changed, 59 insertions(+), 30 deletions(-) > > Outside of the SELinux test suite, what testing have you done when you > have a Linux box forwarding between a CIPSO network segment and an > unlabeled segment? I'm specifically interested in stream based protocols > such as TCP. Also, do the rest of the netfilter callbacks handle it okay > if the skb changes size in one of the callbacks? Granted it has been > *years* since this code was written (decades?), but if I recall > correctly, at the time it was a big no-no to change the skb size in a > netfilter callback. I didn't test that, TBH. But all of cipso_v4_skbuff_setattr(), calipso_skbuff_setattr(), and calipso_skbuff_delattr() already do skb_push()/skb_pull(), so they would all be broken if that is (still?) true. And this new cipso_v4_skbuff_delattr() doesn't do anything w.r.t. the skb and the IP header that those wouldn't do already. [...] > > @@ -2246,7 +2253,8 @@ int cipso_v4_skbuff_setattr(struct sk_buff *skb, > > */ > > int cipso_v4_skbuff_delattr(struct sk_buff *skb) > > { > > - int ret_val; > > + int ret_val, cipso_len, hdr_len_actual, new_hdr_len_actual, new_hdr_len, > > + hdr_len_delta; > > Please keep line lengths under 80-chars whenever possible. I know Linus > relaxed that requirement a while ago, but I still find the 80-char limit > to be a positive thing. I believe the line is exactly 80 characters, so still within the limit. Or is it < 80 instead of <= 80? Does it really matter? > > > struct iphdr *iph; > > struct ip_options *opt = &IPCB(skb)->opt; > > unsigned char *cipso_ptr; > > @@ -2259,16 +2267,37 @@ int cipso_v4_skbuff_delattr(struct sk_buff *skb) > > if (ret_val < 0) > > return ret_val; > > > > - /* the easiest thing to do is just replace the cipso option with noop > > - * options since we don't change the size of the packet, although we > > - * still need to recalculate the checksum */ > > Unless you can guarantee that the length change isn't going to have > any adverse effects (even then I would want to know why you are so > confident), I'd feel a lot more comfortable sticking with a > preserve-the-size-and-fill approach. If you want to change that from > _NOP to _END, I'd be okay with that. > > (and if you are talking to who I think you are talking to, I'm guessing > the _NOP to _END swap would likely solve their problem) So, to reveal all the cards, the issue that has prompted me to send this patch is that a user may have a router configured to drop packets containing any IP options [1][2] and may expect a Linux host with NetLabel configured as unlabeled for a target IP address to output/forward packets without IP options if CIPSO was the only option present before NetLabel processing (such that it can then pass through the strict router(s)). Padding with IPOPT_END *might* solve this particular case, but I'm not sure if the routers would really interpret such padding as "no options"... I'll try to ask the reporter to test such a patch and we'll see. But still, I don't yet see a convincing reason to not go all the way and make sure we send optimally-sized packets here. Side note: We could only clear CIPSO with IPOPT_END if it's the last option in the header, but that limitation doesn't really matter for the use case described above. [1] https://www.stigviewer.com/stig/juniper_router_rtr/2019-09-27/finding/V-90937 [2] https://www.stigviewer.com/stig/cisco_ios_xe_router_rtr/2021-03-26/finding/V-217001 > > > iph = ip_hdr(skb); > > cipso_ptr = (unsigned char *)iph + opt->cipso; > > - memset(cipso_ptr, IPOPT_NOOP, cipso_ptr[1]); > > + cipso_len = cipso_ptr[1]; > > + > > + hdr_len_actual = sizeof(struct iphdr) + > > + cipso_v4_get_actual_opt_len((unsigned char *)(iph + 1), > > + opt->optlen); > > + new_hdr_len_actual = hdr_len_actual - cipso_len; > > + new_hdr_len = (new_hdr_len_actual + 3) & ~3; > > + hdr_len_delta = (iph->ihl << 2) - new_hdr_len; > > + > > + /* 1. shift any options after CIPSO to the left */ > > + memmove(cipso_ptr, cipso_ptr + cipso_len, > > + new_hdr_len_actual - opt->cipso); > > + /* 2. move the whole IP header to its new place */ > > + memmove((unsigned char *)iph + hdr_len_delta, iph, new_hdr_len_actual); > > + /* 3. adjust the skb layout */ > > + skb_pull(skb, hdr_len_delta); > > + skb_reset_network_header(skb); > > + iph = ip_hdr(skb); > > + /* 4. re-fill new padding with IPOPT_END (may now be longer) */ > > + memset((unsigned char *)iph + new_hdr_len_actual, IPOPT_END, > > + new_hdr_len - new_hdr_len_actual); > > + opt->optlen -= hdr_len_delta; > > opt->cipso = 0; > > opt->is_changed = 1; > > - > > + if (hdr_len_delta != 0) { > > + iph->ihl = new_hdr_len >> 2; > > + iph_set_totlen(iph, skb->len); > > + } > > ip_send_check(iph); > > > > return 0; > > -- > > 2.44.0 > > -- > paul-moore.com > -- Ondrej Mosnacek Senior Software Engineer, Linux Security - SELinux kernel Red Hat, Inc.
On Wed, Apr 17, 2024 at 9:03 AM Ondrej Mosnacek <omosnace@redhat.com> wrote: > On Tue, Apr 16, 2024 at 8:39 PM Paul Moore <paul@paul-moore.com> wrote: > > On Apr 16, 2024 Ondrej Mosnacek <omosnace@redhat.com> wrote: > > > > > > As the comment in this function says, the code currently just clears the > > > CIPSO part with IPOPT_NOP, rather than removing it completely and > > > trimming the packet. This is inconsistent with the other > > > cipso_v4_*_delattr() functions and with CALIPSO (IPv6). > > > > This sentence above implies an equality in handling between those three > > cases that doesn't exist. IPv6 has a radically different approach to > > IP options, comparisions between the two aren't really valid. > > I don't think it's that radically different. They are very different in my mind. The IPv4 vs IPv6 option format and handling should be fairly obvious and I'm sure there are plenty of things written that describe the differences and motivations in excruciating detail so I'm not going to bother trying to do that here; as usual, Google is your friend. I will admit that the skbuff vs socket-based labeling differences are a bit more subtle, but I believe if you look at how the packets are labeled in the two approaches as well as how they are managed and hooked into the LSMs you will start to get a better idea. If that doesn't convince you that these three cases are significantly different, I'm not sure what else I can say other than we have a difference of opinion. Regardless, I stand by my original comment that I don't like the text you chose and would like you to remove or change it. > > > Implement the proper option removal to make it consistent and producing > > > more optimal IP packets when there are CIPSO options set. > > > > > > Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> > > > --- > > > net/ipv4/cipso_ipv4.c | 89 ++++++++++++++++++++++++++++--------------- > > > 1 file changed, 59 insertions(+), 30 deletions(-) > > > > Outside of the SELinux test suite, what testing have you done when you > > have a Linux box forwarding between a CIPSO network segment and an > > unlabeled segment? I'm specifically interested in stream based protocols > > such as TCP. Also, do the rest of the netfilter callbacks handle it okay > > if the skb changes size in one of the callbacks? Granted it has been > > *years* since this code was written (decades?), but if I recall > > correctly, at the time it was a big no-no to change the skb size in a > > netfilter callback. > > I didn't test that, TBH. But all of cipso_v4_skbuff_setattr(), > calipso_skbuff_setattr(), and calipso_skbuff_delattr() already do > skb_push()/skb_pull(), so they would all be broken if that is (still?) > true. And this new cipso_v4_skbuff_delattr() doesn't do anything > w.r.t. the skb and the IP header that those wouldn't do already. Fair point on skbuff size changes in netfilter and cipso_v4_skbuff_setattr(), that wasn't part of the original NetLabel/CIPSO support and I forgot about that aspect. On the other hand, I believe cipso_v4_skbuff_delattr() was part of the original work and used the NOOP hack both to preserve the packet length in the netfilter chain and to help ensure a consistent IP header overhead on both sides of a forwarding CIPSO<->unlabeled labeling/access control system. Which brings me around to the reason why I asked about testing; I think we need to confirm that nothing bad happens to bidirectional stream-based connections, e.g. TCP, when crossing over a CIPSO/unlabeled network boundary and the IP overhead changes. It's probably okay, but I would like to see that you've tested it with a couple different client OSes and everything works as expected. > [...] > > > @@ -2246,7 +2253,8 @@ int cipso_v4_skbuff_setattr(struct sk_buff *skb, > > > */ > > > int cipso_v4_skbuff_delattr(struct sk_buff *skb) > > > { > > > - int ret_val; > > > + int ret_val, cipso_len, hdr_len_actual, new_hdr_len_actual, new_hdr_len, > > > + hdr_len_delta; > > > > Please keep line lengths under 80-chars whenever possible. I know Linus > > relaxed that requirement a while ago, but I still find the 80-char limit > > to be a positive thing. > > I believe the line is exactly 80 characters, so still within the > limit. Or is it < 80 instead of <= 80? Does it really matter? I thought I saw it wrap on my terminal when reviewing the code, maybe it was just the newlink wrapping that I saw. As long as it is <= 80 I'm okay with it. > > > > > struct iphdr *iph; > > > struct ip_options *opt = &IPCB(skb)->opt; > > > unsigned char *cipso_ptr; > > > @@ -2259,16 +2267,37 @@ int cipso_v4_skbuff_delattr(struct sk_buff *skb) > > > if (ret_val < 0) > > > return ret_val; > > > > > > - /* the easiest thing to do is just replace the cipso option with noop > > > - * options since we don't change the size of the packet, although we > > > - * still need to recalculate the checksum */ > > > > Unless you can guarantee that the length change isn't going to have > > any adverse effects (even then I would want to know why you are so > > confident), I'd feel a lot more comfortable sticking with a > > preserve-the-size-and-fill approach. If you want to change that from > > _NOP to _END, I'd be okay with that. > > > > (and if you are talking to who I think you are talking to, I'm guessing > > the _NOP to _END swap would likely solve their problem) > > So, to reveal all the cards, the issue that has prompted me to send > this patch is that a user may have a router configured to drop packets > containing any IP options [1][2] and may expect a Linux host with > NetLabel configured as unlabeled for a target IP address to > output/forward packets without IP options if CIPSO was the only option > present before NetLabel processing (such that it can then pass through > the strict router(s)). > > Padding with IPOPT_END *might* solve this particular case, but I'm not > sure if the routers would really interpret such padding as "no > options"... I'll try to ask the reporter to test such a patch and > we'll see. But still, I don't yet see a convincing reason to not go > all the way and make sure we send optimally-sized packets here. I'm about 99.5% certain we are talking about the same reporter, although I was missing the detail about an intermediate switching/routing node; the problem report I saw was that there was simply a black-box device on the network that was dropping packets due to the presence of NOOP options. IMO, the original RFCs are a little too vague in this area, but it doesn't really matter if an intermediate node is dropping the packets. > Side note: We could only clear CIPSO with IPOPT_END if it's the last > option in the header ... Obviously, I kinda assumed anyone following along would understand that. > ... but that limitation doesn't really matter for > the use case described above.
On Thu, Apr 25, 2024 at 11:48 PM Paul Moore <paul@paul-moore.com> wrote: > > On Wed, Apr 17, 2024 at 9:03 AM Ondrej Mosnacek <omosnace@redhat.com> wrote: > > On Tue, Apr 16, 2024 at 8:39 PM Paul Moore <paul@paul-moore.com> wrote: > > > On Apr 16, 2024 Ondrej Mosnacek <omosnace@redhat.com> wrote: > > > > > > > > As the comment in this function says, the code currently just clears the > > > > CIPSO part with IPOPT_NOP, rather than removing it completely and > > > > trimming the packet. This is inconsistent with the other > > > > cipso_v4_*_delattr() functions and with CALIPSO (IPv6). > > > > > > This sentence above implies an equality in handling between those three > > > cases that doesn't exist. IPv6 has a radically different approach to > > > IP options, comparisions between the two aren't really valid. > > > > I don't think it's that radically different. > > They are very different in my mind. The IPv4 vs IPv6 option format > and handling should be fairly obvious and I'm sure there are plenty of > things written that describe the differences and motivations in > excruciating detail so I'm not going to bother trying to do that here; > as usual, Google is your friend. I will admit that the skbuff vs > socket-based labeling differences are a bit more subtle, but I believe > if you look at how the packets are labeled in the two approaches as > well as how they are managed and hooked into the LSMs you will start > to get a better idea. If that doesn't convince you that these three > cases are significantly different, I'm not sure what else I can say > other than we have a difference of opinion. Regardless, I stand by my > original comment that I don't like the text you chose and would like > you to remove or change it. Ok, I amended this part for v2 to hopefully better express what I'm alluding to. I also added a paragraph about the routers dropping packets with IP options, which explains the motivation better, anyway. > > > > Implement the proper option removal to make it consistent and producing > > > > more optimal IP packets when there are CIPSO options set. > > > > > > > > Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> > > > > --- > > > > net/ipv4/cipso_ipv4.c | 89 ++++++++++++++++++++++++++++--------------- > > > > 1 file changed, 59 insertions(+), 30 deletions(-) > > > > > > Outside of the SELinux test suite, what testing have you done when you > > > have a Linux box forwarding between a CIPSO network segment and an > > > unlabeled segment? I'm specifically interested in stream based protocols > > > such as TCP. Also, do the rest of the netfilter callbacks handle it okay > > > if the skb changes size in one of the callbacks? Granted it has been > > > *years* since this code was written (decades?), but if I recall > > > correctly, at the time it was a big no-no to change the skb size in a > > > netfilter callback. > > > > I didn't test that, TBH. But all of cipso_v4_skbuff_setattr(), > > calipso_skbuff_setattr(), and calipso_skbuff_delattr() already do > > skb_push()/skb_pull(), so they would all be broken if that is (still?) > > true. And this new cipso_v4_skbuff_delattr() doesn't do anything > > w.r.t. the skb and the IP header that those wouldn't do already. > > Fair point on skbuff size changes in netfilter and > cipso_v4_skbuff_setattr(), that wasn't part of the original > NetLabel/CIPSO support and I forgot about that aspect. On the other > hand, I believe cipso_v4_skbuff_delattr() was part of the original > work and used the NOOP hack both to preserve the packet length in the > netfilter chain and to help ensure a consistent IP header overhead on > both sides of a forwarding CIPSO<->unlabeled labeling/access control > system. Which brings me around to the reason why I asked about > testing; I think we need to confirm that nothing bad happens to > bidirectional stream-based connections, e.g. TCP, when crossing over a > CIPSO/unlabeled network boundary and the IP overhead changes. It's > probably okay, but I would like to see that you've tested it with a > couple different client OSes and everything works as expected. I tried to test what you describe - hopefully I got close enough: My test setup has 3 VMs (running Fedora 39 from the stock qcow2 image) A, B, and R, connected via separate links as A <--> R <--> B, where R acts as a router between A and B (net.ipv4.ip_forward is set to 1 on R, and the appropriate routes are set on A, B, R). The A <--> R link has subnet 10.123.123.0/24, A having address 10.123.123.2 and R having 10.123.123.1. The B <--> R link has subnet 10.123.124.0/24, B having address 10.123.124.2 and R having 10.123.124.1. The links are implemented as GRE tunnels over the main network that is shared between the VMs. Netlabel configuration on A: netlabelctl cipsov4 add pass doi:16 tags:5 netlabelctl map del default netlabelctl map add default address:0.0.0.0/0 protocol:unlbl netlabelctl map add default address:::/0 protocol:unlbl netlabelctl map add default address:10.123.123.0/24 protocol:cipsov4,16 netlabelctl map add default address:10.123.124.0/24 protocol:cipsov4,16 Netlabel configuration on R: netlabelctl cipsov4 add pass doi:16 tags:5 netlabelctl map del default netlabelctl map add default address:0.0.0.0/0 protocol:unlbl netlabelctl map add default address:::/0 protocol:unlbl netlabelctl map add default address:10.123.123.0/24 protocol:cipsov4,16 B has no netlabel configured. (I.e. A tries to send CIPSO-labeled packets to B, but R treats the 10.123.124.0/24 network as unlabeled, so should strip/add the CIPSO label when forwarding between A and B.) A basic TCP connection worked just fine in both directions with and without these patches applied (I installed the patched kernel on all machines, though it should only matter on machine R). I ignored the actual labels/CIPSO content - i.e. I didn't change the default SELinux policy and put SELinux into permissive mode on machines A and R. Capturing the packets on R showed the following IP option content without the patches: A -> R: CIPSO R -> B: NOPs B -> R: (empty) R -> A: CIPSO With the patches this changed to: A -> R: CIPSO R -> B: (empty) B -> R: (empty) R -> A: CIPSO Is this convincing enough or do you have different scenarios in mind?
On Tue, May 14, 2024 at 7:29 AM Ondrej Mosnacek <omosnace@redhat.com> wrote: > On Thu, Apr 25, 2024 at 11:48 PM Paul Moore <paul@paul-moore.com> wrote: > > > > On Wed, Apr 17, 2024 at 9:03 AM Ondrej Mosnacek <omosnace@redhat.com> wrote: > > > On Tue, Apr 16, 2024 at 8:39 PM Paul Moore <paul@paul-moore.com> wrote: > > > > On Apr 16, 2024 Ondrej Mosnacek <omosnace@redhat.com> wrote: > > > > > > > > > > As the comment in this function says, the code currently just clears the > > > > > CIPSO part with IPOPT_NOP, rather than removing it completely and > > > > > trimming the packet. This is inconsistent with the other > > > > > cipso_v4_*_delattr() functions and with CALIPSO (IPv6). > > > > > > > > This sentence above implies an equality in handling between those three > > > > cases that doesn't exist. IPv6 has a radically different approach to > > > > IP options, comparisions between the two aren't really valid. > > > > > > I don't think it's that radically different. > > > > They are very different in my mind. The IPv4 vs IPv6 option format > > and handling should be fairly obvious and I'm sure there are plenty of > > things written that describe the differences and motivations in > > excruciating detail so I'm not going to bother trying to do that here; > > as usual, Google is your friend. I will admit that the skbuff vs > > socket-based labeling differences are a bit more subtle, but I believe > > if you look at how the packets are labeled in the two approaches as > > well as how they are managed and hooked into the LSMs you will start > > to get a better idea. If that doesn't convince you that these three > > cases are significantly different, I'm not sure what else I can say > > other than we have a difference of opinion. Regardless, I stand by my > > original comment that I don't like the text you chose and would like > > you to remove or change it. > > Ok, I amended this part for v2 to hopefully better express what I'm > alluding to. I also added a paragraph about the routers dropping > packets with IP options, which explains the motivation better, anyway. Okay, I'll refrain from further comment until I see the v2 patch. > I tried to test what you describe - hopefully I got close enough: > > My test setup has 3 VMs (running Fedora 39 from the stock qcow2 image) > A, B, and R, connected via separate links as A <--> R <--> B, where R > acts as a router between A and B (net.ipv4.ip_forward is set to 1 on > R, and the appropriate routes are set on A, B, R). > > The A <--> R link has subnet 10.123.123.0/24, A having address > 10.123.123.2 and R having 10.123.123.1. > The B <--> R link has subnet 10.123.124.0/24, B having address > 10.123.124.2 and R having 10.123.124.1. > > The links are implemented as GRE tunnels over the main network that is > shared between the VMs. > > Netlabel configuration on A: > netlabelctl cipsov4 add pass doi:16 tags:5 > netlabelctl map del default > netlabelctl map add default address:0.0.0.0/0 protocol:unlbl > netlabelctl map add default address:::/0 protocol:unlbl > netlabelctl map add default address:10.123.123.0/24 protocol:cipsov4,16 > netlabelctl map add default address:10.123.124.0/24 protocol:cipsov4,16 > > Netlabel configuration on R: > netlabelctl cipsov4 add pass doi:16 tags:5 > netlabelctl map del default > netlabelctl map add default address:0.0.0.0/0 protocol:unlbl > netlabelctl map add default address:::/0 protocol:unlbl > netlabelctl map add default address:10.123.123.0/24 protocol:cipsov4,16 > > B has no netlabel configured. > > (I.e. A tries to send CIPSO-labeled packets to B, but R treats the > 10.123.124.0/24 network as unlabeled, so should strip/add the CIPSO > label when forwarding between A and B.) > > A basic TCP connection worked just fine in both directions with and > without these patches applied (I installed the patched kernel on all > machines, though it should only matter on machine R). I ignored the > actual labels/CIPSO content - i.e. I didn't change the default SELinux > policy and put SELinux into permissive mode on machines A and R. > > Capturing the packets on R showed the following IP option content > without the patches: > A -> R: CIPSO > R -> B: NOPs > B -> R: (empty) > R -> A: CIPSO > > With the patches this changed to: > A -> R: CIPSO > R -> B: (empty) > B -> R: (empty) > R -> A: CIPSO > > Is this convincing enough or do you have different scenarios in mind? Thanks for verifying your patch, the methodology looks good to me, but as I mentioned in my previous email I would feel much better if you verified this with a different client OS/stack. Do you have access to Windows/MacOS/BSD/non-Linux system you could use in place of B in your test above?
On Fri, May 17, 2024 at 9:49 PM Paul Moore <paul@paul-moore.com> wrote: > > On Tue, May 14, 2024 at 7:29 AM Ondrej Mosnacek <omosnace@redhat.com> wrote: > > I tried to test what you describe - hopefully I got close enough: > > > > My test setup has 3 VMs (running Fedora 39 from the stock qcow2 image) > > A, B, and R, connected via separate links as A <--> R <--> B, where R > > acts as a router between A and B (net.ipv4.ip_forward is set to 1 on > > R, and the appropriate routes are set on A, B, R). > > > > The A <--> R link has subnet 10.123.123.0/24, A having address > > 10.123.123.2 and R having 10.123.123.1. > > The B <--> R link has subnet 10.123.124.0/24, B having address > > 10.123.124.2 and R having 10.123.124.1. > > > > The links are implemented as GRE tunnels over the main network that is > > shared between the VMs. > > > > Netlabel configuration on A: > > netlabelctl cipsov4 add pass doi:16 tags:5 > > netlabelctl map del default > > netlabelctl map add default address:0.0.0.0/0 protocol:unlbl > > netlabelctl map add default address:::/0 protocol:unlbl > > netlabelctl map add default address:10.123.123.0/24 protocol:cipsov4,16 > > netlabelctl map add default address:10.123.124.0/24 protocol:cipsov4,16 > > > > Netlabel configuration on R: > > netlabelctl cipsov4 add pass doi:16 tags:5 > > netlabelctl map del default > > netlabelctl map add default address:0.0.0.0/0 protocol:unlbl > > netlabelctl map add default address:::/0 protocol:unlbl > > netlabelctl map add default address:10.123.123.0/24 protocol:cipsov4,16 > > > > B has no netlabel configured. > > > > (I.e. A tries to send CIPSO-labeled packets to B, but R treats the > > 10.123.124.0/24 network as unlabeled, so should strip/add the CIPSO > > label when forwarding between A and B.) > > > > A basic TCP connection worked just fine in both directions with and > > without these patches applied (I installed the patched kernel on all > > machines, though it should only matter on machine R). I ignored the > > actual labels/CIPSO content - i.e. I didn't change the default SELinux > > policy and put SELinux into permissive mode on machines A and R. > > > > Capturing the packets on R showed the following IP option content > > without the patches: > > A -> R: CIPSO > > R -> B: NOPs > > B -> R: (empty) > > R -> A: CIPSO > > > > With the patches this changed to: > > A -> R: CIPSO > > R -> B: (empty) > > B -> R: (empty) > > R -> A: CIPSO > > > > Is this convincing enough or do you have different scenarios in mind? > > Thanks for verifying your patch, the methodology looks good to me, but > as I mentioned in my previous email I would feel much better if you > verified this with a different client OS/stack. Do you have access to > Windows/MacOS/BSD/non-Linux system you could use in place of B in your > test above? I don't think I can easily plug that into the framework I used for the testing (there doesn't seem to be a convenient way to install a FreeBSD VM without manual interaction and the rest is proprietary). I still don't quite understand what exactly you expect to break under that scenario and why - could you elaborate on that? If anything, I'd expect the IP header growing along the path (which already happens pretty much by design in the opposite direction) to be more likely to cause an issue. -- Ondrej Mosnacek Senior Software Engineer, Linux Security - SELinux kernel Red Hat, Inc.
On Fri, Jun 7, 2024 at 11:47 AM Ondrej Mosnacek <omosnace@redhat.com> wrote: > On Fri, May 17, 2024 at 9:49 PM Paul Moore <paul@paul-moore.com> wrote: > > > > On Tue, May 14, 2024 at 7:29 AM Ondrej Mosnacek <omosnace@redhat.com> wrote: > > > I tried to test what you describe - hopefully I got close enough: > > > > > > My test setup has 3 VMs (running Fedora 39 from the stock qcow2 image) > > > A, B, and R, connected via separate links as A <--> R <--> B, where R > > > acts as a router between A and B (net.ipv4.ip_forward is set to 1 on > > > R, and the appropriate routes are set on A, B, R). > > > > > > The A <--> R link has subnet 10.123.123.0/24, A having address > > > 10.123.123.2 and R having 10.123.123.1. > > > The B <--> R link has subnet 10.123.124.0/24, B having address > > > 10.123.124.2 and R having 10.123.124.1. > > > > > > The links are implemented as GRE tunnels over the main network that is > > > shared between the VMs. > > > > > > Netlabel configuration on A: > > > netlabelctl cipsov4 add pass doi:16 tags:5 > > > netlabelctl map del default > > > netlabelctl map add default address:0.0.0.0/0 protocol:unlbl > > > netlabelctl map add default address:::/0 protocol:unlbl > > > netlabelctl map add default address:10.123.123.0/24 protocol:cipsov4,16 > > > netlabelctl map add default address:10.123.124.0/24 protocol:cipsov4,16 > > > > > > Netlabel configuration on R: > > > netlabelctl cipsov4 add pass doi:16 tags:5 > > > netlabelctl map del default > > > netlabelctl map add default address:0.0.0.0/0 protocol:unlbl > > > netlabelctl map add default address:::/0 protocol:unlbl > > > netlabelctl map add default address:10.123.123.0/24 protocol:cipsov4,16 > > > > > > B has no netlabel configured. > > > > > > (I.e. A tries to send CIPSO-labeled packets to B, but R treats the > > > 10.123.124.0/24 network as unlabeled, so should strip/add the CIPSO > > > label when forwarding between A and B.) > > > > > > A basic TCP connection worked just fine in both directions with and > > > without these patches applied (I installed the patched kernel on all > > > machines, though it should only matter on machine R). I ignored the > > > actual labels/CIPSO content - i.e. I didn't change the default SELinux > > > policy and put SELinux into permissive mode on machines A and R. > > > > > > Capturing the packets on R showed the following IP option content > > > without the patches: > > > A -> R: CIPSO > > > R -> B: NOPs > > > B -> R: (empty) > > > R -> A: CIPSO > > > > > > With the patches this changed to: > > > A -> R: CIPSO > > > R -> B: (empty) > > > B -> R: (empty) > > > R -> A: CIPSO > > > > > > Is this convincing enough or do you have different scenarios in mind? > > > > Thanks for verifying your patch, the methodology looks good to me, but > > as I mentioned in my previous email I would feel much better if you > > verified this with a different client OS/stack. Do you have access to > > Windows/MacOS/BSD/non-Linux system you could use in place of B in your > > test above? > > I don't think I can easily plug that into the framework I used for the > testing (there doesn't seem to be a convenient way to install a > FreeBSD VM without manual interaction and the rest is proprietary). Surely you can perform a manual unit test with some VMs on your local machine if whatever test automation you are using doesn't support this. > I still don't quite understand what exactly you expect to break under > that scenario and why - could you elaborate on that? If anything, I'd > expect the IP header growing along the path (which already happens > pretty much by design in the opposite direction) to be more likely to > cause an issue. I'm concerned about potential oddities caused by the changes in IP header sizes while the packet is in flight. Every OS's network stack is a bit different and I don't think it is too much to ask to test at least one non-Linux network stack as a client.
diff --git a/net/ipv4/cipso_ipv4.c b/net/ipv4/cipso_ipv4.c index 75b5e3c35f9bf..c08c6d0262ba8 100644 --- a/net/ipv4/cipso_ipv4.c +++ b/net/ipv4/cipso_ipv4.c @@ -1810,6 +1810,34 @@ static int cipso_v4_genopt(unsigned char *buf, u32 buf_len, return CIPSO_V4_HDR_LEN + ret_val; } +static int cipso_v4_get_actual_opt_len(const unsigned char *data, int len) +{ + int iter = 0, optlen = 0; + + /* determining the new total option length is tricky because of + * the padding necessary, the only thing i can think to do at + * this point is walk the options one-by-one, skipping the + * padding at the end to determine the actual option size and + * from there we can determine the new total option length + */ + while (iter < len) { + if (data[iter] == IPOPT_END) { + break; + } else if (data[iter] == IPOPT_NOP) { + iter++; + } else { + if (WARN_ON(data[iter + 1] < 2)) + iter += 2; + else + iter += data[iter + 1]; + optlen = iter; + } + } + if (WARN_ON(optlen > len)) + optlen = len; + return optlen; +} + /** * cipso_v4_sock_setattr - Add a CIPSO option to a socket * @sk: the socket @@ -1985,7 +2013,6 @@ static int cipso_v4_delopt(struct ip_options_rcu __rcu **opt_ptr) u8 cipso_len; u8 cipso_off; unsigned char *cipso_ptr; - int iter; int optlen_new; cipso_off = opt->opt.cipso - sizeof(struct iphdr); @@ -2005,28 +2032,8 @@ static int cipso_v4_delopt(struct ip_options_rcu __rcu **opt_ptr) memmove(cipso_ptr, cipso_ptr + cipso_len, opt->opt.optlen - cipso_off - cipso_len); - /* determining the new total option length is tricky because of - * the padding necessary, the only thing i can think to do at - * this point is walk the options one-by-one, skipping the - * padding at the end to determine the actual option size and - * from there we can determine the new total option length */ - iter = 0; - optlen_new = 0; - while (iter < opt->opt.optlen) { - if (opt->opt.__data[iter] == IPOPT_END) { - break; - } else if (opt->opt.__data[iter] == IPOPT_NOP) { - iter++; - } else { - if (WARN_ON(opt->opt.__data[iter + 1] < 2)) - iter += 2; - else - iter += opt->opt.__data[iter + 1]; - optlen_new = iter; - } - } - if (WARN_ON(optlen_new > opt->opt.optlen)) - optlen_new = opt->opt.optlen; + optlen_new = cipso_v4_get_actual_opt_len(opt->opt.__data, + opt->opt.optlen); hdr_delta = opt->opt.optlen; opt->opt.optlen = (optlen_new + 3) & ~3; hdr_delta -= opt->opt.optlen; @@ -2246,7 +2253,8 @@ int cipso_v4_skbuff_setattr(struct sk_buff *skb, */ int cipso_v4_skbuff_delattr(struct sk_buff *skb) { - int ret_val; + int ret_val, cipso_len, hdr_len_actual, new_hdr_len_actual, new_hdr_len, + hdr_len_delta; struct iphdr *iph; struct ip_options *opt = &IPCB(skb)->opt; unsigned char *cipso_ptr; @@ -2259,16 +2267,37 @@ int cipso_v4_skbuff_delattr(struct sk_buff *skb) if (ret_val < 0) return ret_val; - /* the easiest thing to do is just replace the cipso option with noop - * options since we don't change the size of the packet, although we - * still need to recalculate the checksum */ - iph = ip_hdr(skb); cipso_ptr = (unsigned char *)iph + opt->cipso; - memset(cipso_ptr, IPOPT_NOOP, cipso_ptr[1]); + cipso_len = cipso_ptr[1]; + + hdr_len_actual = sizeof(struct iphdr) + + cipso_v4_get_actual_opt_len((unsigned char *)(iph + 1), + opt->optlen); + new_hdr_len_actual = hdr_len_actual - cipso_len; + new_hdr_len = (new_hdr_len_actual + 3) & ~3; + hdr_len_delta = (iph->ihl << 2) - new_hdr_len; + + /* 1. shift any options after CIPSO to the left */ + memmove(cipso_ptr, cipso_ptr + cipso_len, + new_hdr_len_actual - opt->cipso); + /* 2. move the whole IP header to its new place */ + memmove((unsigned char *)iph + hdr_len_delta, iph, new_hdr_len_actual); + /* 3. adjust the skb layout */ + skb_pull(skb, hdr_len_delta); + skb_reset_network_header(skb); + iph = ip_hdr(skb); + /* 4. re-fill new padding with IPOPT_END (may now be longer) */ + memset((unsigned char *)iph + new_hdr_len_actual, IPOPT_END, + new_hdr_len - new_hdr_len_actual); + + opt->optlen -= hdr_len_delta; opt->cipso = 0; opt->is_changed = 1; - + if (hdr_len_delta != 0) { + iph->ihl = new_hdr_len >> 2; + iph_set_totlen(iph, skb->len); + } ip_send_check(iph); return 0;
As the comment in this function says, the code currently just clears the CIPSO part with IPOPT_NOP, rather than removing it completely and trimming the packet. This is inconsistent with the other cipso_v4_*_delattr() functions and with CALIPSO (IPv6). Implement the proper option removal to make it consistent and producing more optimal IP packets when there are CIPSO options set. Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> --- net/ipv4/cipso_ipv4.c | 89 ++++++++++++++++++++++++++++--------------- 1 file changed, 59 insertions(+), 30 deletions(-)