Message ID | 20250321164537.16719-1-bboscaccy@linux.microsoft.com (mailing list archive) |
---|---|
Series | Introducing Hornet LSM |

On Fri, Mar 21, 2025 at 12:45 PM Blaise Boscaccy <bboscaccy@linux.microsoft.com> wrote:
>
> This patch series introduces the Hornet LSM.
>
> Hornet takes a simple approach to light-skeleton-based eBPF signature
> verification. Signature data can be easily generated for the binary
> data that is generated via bpftool gen -L. This signature can be
> appended to a skeleton executable via scripts/sign-ebpf. Hornet checks
> the signature against a binary buffer containing the lskel
> instructions that the loader maps use. Maps are frozen to prevent
> TOCTOU bugs where a sufficiently privileged user could rewrite map
> data between the calls to BPF_PROG_LOAD and
> BPF_PROG_RUN. Additionally, both sparse-array-based and
> fd_array_cnt-based map fd arrays are supported for signature
> verification.
>
> Blaise Boscaccy (4):
>   security: Hornet LSM
>   hornet: Introduce sign-ebpf
>   hornet: Add an example lskel data extactor script
>   selftests/hornet: Add a selftest for the hornet LSM

Thanks Blaise, I noticed a few minor things, but nothing critical. As
I understand it, you'll be presenting Hornet at LSFMMBPF next week?
Assuming that's the case, I'm going to hold off on reviewing this
until we hear how that went next week; please report back after the
conference.

However, to be clear, the Hornet LSM proposed here seems very
reasonable to me and I would have no conceptual objections to merging
it upstream. Based on off-list discussions I believe there is a lot
of demand for something like this, and I believe many people will be
happy to have BPF signature verification in-tree.

On Fri, Mar 21, 2025 at 09:45:02AM -0700, Blaise Boscaccy wrote:
> This patch series introduces the Hornet LSM.
>
> Hornet takes a simple approach to light-skeleton-based eBPF signature

Can you define "light-skeleton-based" before using the term.

This is the first time in my life when I hear about it.

> verification. Signature data can be easily generated for the binary

s/easily//

Useless word having no measure.

> data that is generated via bpftool gen -L. This signature can be

I have no idea what that command does.

"Signature data can be generated for the binary data as follows:

bpftool gen -L

<explanation>"

Here you'd need to answer to couple of unknowns:

1. What is in exact terms "signature data"?
2. What does "bpftool gen -L" do?

This feedback maps to other examples too in the cover letter.

BR, Jarkko

On Sat, Mar 22, 2025 at 1:22 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> On Fri, Mar 21, 2025 at 09:45:02AM -0700, Blaise Boscaccy wrote:
> > This patch series introduces the Hornet LSM.
> >
> > Hornet takes a simple approach to light-skeleton-based eBPF signature
>
> Can you define "light-skeleton-based" before using the term.
>
> This is the first time in my life when I hear about it.

I was in the same situation a few months ago when I first heard about it :)

Blaise can surely provide a much better answer than what I'm about to
write, but since Blaise is going to be at LSFMMBPF this coming week I
suspect he might not have a lot of time to respond to email in the
next few days so I thought I would do my best to try and answer :)

An eBPF "light skeleton" is basically a BPF loader program and while
I'm sure there are several uses for a light skeleton, or lskel for
brevity, the single use case that we are interested in here, and the
one that Hornet deals with, is the idea of using a lskel to enable
signature verification of BPF programs as it seems to be the one way
that has been deemed acceptable by the BPF maintainers.

Once again, skipping over a lot of details, the basic idea is that you
take your original BPF program (A), feed it into a BPF userspace tool
to encapsulate the original program A into a BPF map and generate a
corresponding light skeleton BPF program (B), and then finally sign
the resulting binary containing the lskel program (B) and map
corresponding to the original program A. At runtime, the lskel binary
is loaded into the kernel, and if Hornet is enabled, the signature of
both the lskel program A and original program B is verified. If the
signature verification passes, lskel program A performs the necessary
BPF CO-RE transforms on BPF program A stored in the BPF map and then
attempts to load the original BPF program B, all from within the
kernel, and with the map frozen to prevent tampering from userspace.

Hopefully that helps fill in some gaps until someone more
knowledgeable can provide a better answer and/or correct any mistakes
in my explanation above ;)

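As a rough illustration of the sign, freeze, and verify flow described above, here is a minimal Python sketch. It is a toy model, not Hornet's implementation: A is the original program and B is the lskel loader, as first defined in the explanation; HMAC-SHA256 from the standard library stands in for a real PKCS#7 signature; and `LoaderMap`, `sign_lskel`, and `verify_and_load` are hypothetical names introduced only for this example.

```python
import hashlib
import hmac


class LoaderMap:
    """Toy stand-in for the BPF map that carries original program A."""

    def __init__(self, data: bytes):
        self._data = bytearray(data)
        self._frozen = False

    def freeze(self) -> None:
        # After this point, writes from "userspace" are rejected.
        self._frozen = True

    def write(self, offset: int, value: bytes) -> None:
        if self._frozen:
            raise PermissionError("map is frozen")
        self._data[offset:offset + len(value)] = value

    def read(self) -> bytes:
        return bytes(self._data)


def sign_lskel(key: bytes, loader_insns: bytes, map_data: bytes) -> bytes:
    # Assumption: an HMAC replaces the PKCS#7 signature. It covers the
    # loader (B) instructions followed by the map contents holding A.
    return hmac.new(key, loader_insns + map_data, hashlib.sha256).digest()


def verify_and_load(key: bytes, loader_insns: bytes,
                    loader_map: LoaderMap, signature: bytes) -> bool:
    # Freeze before checking, so the data that was verified is the data
    # that later gets used (this closes the TOCTOU window).
    loader_map.freeze()
    expected = sign_lskel(key, loader_insns, loader_map.read())
    return hmac.compare_digest(expected, signature)
```

The ordering is the point of the sketch: the map is frozen before the signature is compared, so a later write cannot swap in different data between the check and the use.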
On Sat, Mar 22, 2025 at 4:44 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Sat, Mar 22, 2025 at 1:22 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > On Fri, Mar 21, 2025 at 09:45:02AM -0700, Blaise Boscaccy wrote:
> > > This patch series introduces the Hornet LSM.
> > >
> > > Hornet takes a simple approach to light-skeleton-based eBPF signature
> >
> > Can you define "light-skeleton-based" before using the term.
> >
> > This is the first time in my life when I hear about it.
>
> I was in the same situation a few months ago when I first heard about it :)
>
> Blaise can surely provide a much better answer than what I'm about to
> write, but since Blaise is going to be at LSFMMBPF this coming week I
> suspect he might not have a lot of time to respond to email in the
> next few days so I thought I would do my best to try and answer :)
>
> An eBPF "light skeleton" is basically a BPF loader program and while
> I'm sure there are several uses for a light skeleton, or lskel for
> brevity, the single use case that we are interested in here, and the
> one that Hornet deals with, is the idea of using a lskel to enable
> signature verification of BPF programs as it seems to be the one way
> that has been deemed acceptable by the BPF maintainers.
>
> Once again, skipping over a lot of details, the basic idea is that you
> take your original BPF program (A), feed it into a BPF userspace tool
> to encapsulate the original program A into a BPF map and generate a
> corresponding light skeleton BPF program (B), and then finally sign
> the resulting binary containing the lskel program (B) and map
> corresponding to the original program A.

Forgive me, I mixed up my "A" and "B" above :/

> At runtime, the lskel binary
> is loaded into the kernel, and if Hornet is enabled, the signature of
> both the lskel program A and original program B is verified.

... and I did again here

> If the
> signature verification passes, lskel program A performs the necessary
> BPF CO-RE transforms on BPF program A stored in the BPF map and then
> attempts to load the original BPF program B, all from within the
> kernel, and with the map frozen to prevent tampering from userspace.

... and once more here because why not? :)

> Hopefully that helps fill in some gaps until someone more
> knowledgeable can provide a better answer and/or correct any mistakes
> in my explanation above ;)

On Sat, Mar 22, 2025 at 04:44:13PM -0400, Paul Moore wrote:
> On Sat, Mar 22, 2025 at 1:22 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > On Fri, Mar 21, 2025 at 09:45:02AM -0700, Blaise Boscaccy wrote:
> > > This patch series introduces the Hornet LSM.
> > >
> > > Hornet takes a simple approach to light-skeleton-based eBPF signature
> >
> > Can you define "light-skeleton-based" before using the term.
> >
> > This is the first time in my life when I hear about it.
>
> I was in the same situation a few months ago when I first heard about it :)
>
> Blaise can surely provide a much better answer than what I'm about to
> write, but since Blaise is going to be at LSFMMBPF this coming week I
> suspect he might not have a lot of time to respond to email in the
> next few days so I thought I would do my best to try and answer :)

Yeah, I don't think there is anything largely wrong in the feature
itself but it speaks language that would fit to eBPF subsystem list,
not here :-) I.e. assume only very basic knowledge of eBPF and explain
what stuff mentioned actually does. Like bpftool statement should be
opened up fully.

>
> An eBPF "light skeleton" is basically a BPF loader program and while
> I'm sure there are several uses for a light skeleton, or lskel for
> brevity, the single use case that we are interested in here, and the
> one that Hornet deals with, is the idea of using a lskel to enable
> signature verification of BPF programs as it seems to be the one way
> that has been deemed acceptable by the BPF maintainers.

I got some grip but the term only should be used IMHO in the commit
message, if it is defined at first :-)

>
> Once again, skipping over a lot of details, the basic idea is that you
> take your original BPF program (A), feed it into a BPF userspace tool
> to encapsulate the original program A into a BPF map and generate a
> corresponding light skeleton BPF program (B), and then finally sign
> the resulting binary containing the lskel program (B) and map
> corresponding to the original program A. At runtime, the lskel binary
> is loaded into the kernel, and if Hornet is enabled, the signature of
> both the lskel program A and original program B is verified. If the
> signature verification passes, lskel program A performs the necessary
> BPF CO-RE transforms on BPF program A stored in the BPF map and then
> attempts to load the original BPF program B, all from within the
> kernel, and with the map frozen to prevent tampering from userspace.

When you speak about corresponding lskel program what does that program
contain? Is it some kind of new version of the same program with
modifications, or?

I neither did not know what BPF CO-RE is but I googled it ;-)

>
> Hopefully that helps fill in some gaps until someone more
> knowledgeable can provide a better answer and/or correct any mistakes
> in my explanation above ;)

Sure... Thanks for the explanations!

>
> --
> paul-moore.com

BR, Jarkko

On Sat, Mar 22, 2025 at 04:48:14PM -0400, Paul Moore wrote:
> On Sat, Mar 22, 2025 at 4:44 PM Paul Moore <paul@paul-moore.com> wrote:
> >
> > On Sat, Mar 22, 2025 at 1:22 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > > On Fri, Mar 21, 2025 at 09:45:02AM -0700, Blaise Boscaccy wrote:
> > > > This patch series introduces the Hornet LSM.
> > > >
> > > > Hornet takes a simple approach to light-skeleton-based eBPF signature
> > >
> > > Can you define "light-skeleton-based" before using the term.
> > >
> > > This is the first time in my life when I hear about it.
> >
> > I was in the same situation a few months ago when I first heard about it :)
> >
> > Blaise can surely provide a much better answer than what I'm about to
> > write, but since Blaise is going to be at LSFMMBPF this coming week I
> > suspect he might not have a lot of time to respond to email in the
> > next few days so I thought I would do my best to try and answer :)
> >
> > An eBPF "light skeleton" is basically a BPF loader program and while
> > I'm sure there are several uses for a light skeleton, or lskel for
> > brevity, the single use case that we are interested in here, and the
> > one that Hornet deals with, is the idea of using a lskel to enable
> > signature verification of BPF programs as it seems to be the one way
> > that has been deemed acceptable by the BPF maintainers.
> >
> > Once again, skipping over a lot of details, the basic idea is that you
> > take your original BPF program (A), feed it into a BPF userspace tool
> > to encapsulate the original program A into a BPF map and generate a
> > corresponding light skeleton BPF program (B), and then finally sign
> > the resulting binary containing the lskel program (B) and map
> > corresponding to the original program A.
>
> Forgive me, I mixed up my "A" and "B" above :/
>
> > At runtime, the lskel binary
> > is loaded into the kernel, and if Hornet is enabled, the signature of
> > both the lskel program A and original program B is verified.
>
> ... and I did again here
>
> > If the
> > signature verification passes, lskel program A performs the necessary
> > BPF CO-RE transforms on BPF program A stored in the BPF map and then
> > attempts to load the original BPF program B, all from within the
> > kernel, and with the map frozen to prevent tampering from userspace.
>
> ... and once more here because why not? :)

No worries I was able to decipher this :-)

>
> > Hopefully that helps fill in some gaps until someone more
> > knowledgeable can provide a better answer and/or correct any mistakes
> > in my explanation above ;)
>
> --
> paul-moore.com

BR, Jarkko

Jarkko Sakkinen <jarkko@kernel.org> writes:

Hi Jarkko,

Thanks for the comments. Paul did a very nice job providing some
background info, allow me to provide some additional data.

> On Fri, Mar 21, 2025 at 09:45:02AM -0700, Blaise Boscaccy wrote:
>> This patch series introduces the Hornet LSM.
>>
>> Hornet takes a simple approach to light-skeleton-based eBPF signature
>
> Can you define "light-skeleton-based" before using the term.
>
> This is the first time in my life when I hear about it.
>

Sure. Here is the patchset where this stuff got introduced if you are
curious.
https://lore.kernel.org/bpf/20220209054315.73833-1-alexei.starovoitov@gmail.com/

eBPF has similar requirements to that of modules when it comes to
loading: find kallsym addresses, fix up ELF relocations, some struct
field offset handling stuff called CO-RE (compile once, run
everywhere), and some other miscellaneous bookkeeping. During eBPF
program compilation, pseudo-values get written to the immediate
operands of instructions. During loading, those pseudo-values get
rewritten with concrete addresses or data applicable to the currently
running system, e.g. a kallsym address or a fd for a map. This needs
to happen before the instructions for a bpf program are loaded into
the kernel via the bpf() syscall.

Unlike modules, an in-kernel loader unfortunately doesn't exist.
Typically, the instruction rewriting is done dynamically in userspace
via libbpf (or the rust/go/python loader). What skeletons do is
generate a script of required instruction-rewriting operations which
then gets played back at load-time against a hard-coded blob of raw
instruction data. This removes the need to distribute source-code or
object files.

There are two flavors of skeletons, normal skeletons, and light
skeletons. Normal skeletons utilize relocation logic that lives in
libbpf, and the relocations/instruction rewriting happen in userspace.
The second flavor, light skeletons, uses a small eBPF program that
contains the relocation lookup logic. As it's running in the kernel,
it unpacks the target program, performs the instruction rewriting, and
loads the target program. Light skeletons are currently utilized for
some drivers, and BPF_PRELOAD functionality since they can operate
without userspace.

Light skeletons were recommended on various mailing list discussions as
the preferred path to performing signature verification. There are some
PoCs floating around that used light-skeletons in concert with
fs-verity/IMA and eBPF LSMs. We took a slightly different approach to
Hornet, by utilizing the existing PKCS#7 signing scheme that is used for
kernel modules.

>> verification. Signature data can be easily generated for the binary
>
> s/easily//
>
> Useless word having no measure.
>

Ack, thanks.

>> data that is generated via bpftool gen -L. This signature can be
>
> I have no idea what that command does.
>
> "Signature data can be generated for the binary data as follows:
>
> bpftool gen -L
>
> <explanation>"
>
> Here you'd need to answer to couple of unknowns:
>
> 1. What is in exact terms "signature data"?

That is a PKCS#7 signature of a data buffer containing the raw
instructions of an eBPF program, followed by the initial values of any
maps used by the program.

> 2. What does "bpftool gen -L" do?
>

eBPF programs often have 2 parts. An orchestrator/loader program that
provides load -> attach/run -> i/o -> teardown logic and the in-kernel
program.

That command is used to generate a skeleton which can be used by the
orchestrator program. Skeletons get generated as a C header file, that
contains various autogenerated functions that open and load bpf programs
as described above. That header file ends up being included in a
userspace orchestrator program or possibly a kernel module.

> This feedback maps to other examples too in the cover letter.
>
> BR, Jarkko

I'll rework this with some definitions of the eBPF subsystem jargon
along with your suggestions.

-blaise

On Mon, Mar 31, 2025 at 01:57:15PM -0700, Blaise Boscaccy wrote:
> There are two flavors of skeletons, normal skeletons, and light
> skeletons. Normal skeletons utilize relocation logic that lives in
> libbpf, and the relocations/instruction rewriting happen in userspace.
> The second flavor, light skeletons, uses a small eBPF program that
> contains the relocation lookup logic. As it's running in the kernel,
> it unpacks the target program, performs the instruction rewriting, and
> loads the target program. Light skeletons are currently utilized for
> some drivers, and BPF_PRELOAD functionality since they can operate
> without userspace.
>
> Light skeletons were recommended on various mailing list discussions as
> the preferred path to performing signature verification. There are some
> PoCs floating around that used light-skeletons in concert with
> fs-verity/IMA and eBPF LSMs. We took a slightly different approach to
> Hornet, by utilizing the existing PKCS#7 signing scheme that is used for
> kernel modules.

Right, because in the normal skeletons relocation logic remains
unsigned?

I have to admit I don't fully cope how the relocation process translates
into eBPF program but I do get how it is better for signatures if it
does :-)

>
> >> verification. Signature data can be easily generated for the binary
> >
> > s/easily//
> >
> > Useless word having no measure.
> >
>
> Ack, thanks.
>
>
> >> data that is generated via bpftool gen -L. This signature can be
> >
> > I have no idea what that command does.
> >
> > "Signature data can be generated for the binary data as follows:
> >
> > bpftool gen -L
> >
> > <explanation>"
> >
> > Here you'd need to answer to couple of unknowns:
> >
> > 1. What is in exact terms "signature data"?
>
> That is a PKCS#7 signature of a data buffer containing the raw
> instructions of an eBPF program, followed by the initial values of any
> maps used by the program.

Got it, thanks. This motivates to refine my TPM2 asymmetric keys
series so that TPM2 could anchor these :-)

https://lore.kernel.org/linux-integrity/20240528210823.28798-1-jarkko@kernel.org/

>
> > 2. What does "bpftool gen -L" do?
> >
>
> eBPF programs often have 2 parts. An orchestrator/loader program that
> provides load -> attach/run -> i/o -> teardown logic and the in-kernel
> program.
>
> That command is used to generate a skeleton which can be used by the
> orchestrator program. Skeletons get generated as a C header file, that
> contains various autogenerated functions that open and load bpf programs
> as described above. That header file ends up being included in a
> userspace orchestrator program or possibly a kernel module.

I did read the man page now too, but thanks for the commentary!

>
> > This feedback maps to other examples too in the cover letter.
> >
> > BR, Jarkko
>
>
> I'll rework this with some definitions of the eBPF subsystem jargon
> along with your suggestions.

Yeah, you should be able to put the gist a factor better to nutshell :-)

>
> -blaise

BR, Jarkko

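The load-time rewriting discussed in this exchange (placeholder immediates replaced with concrete values, driven by a recorded list of operations played back against a fixed instruction blob) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each instruction slot is taken to be 8 bytes (opcode, register byte, 16-bit offset, 32-bit immediate) in little-endian order, opcode 0x18 is used as a 64-bit immediate load spanning two slots, and 0x95 as exit. The names `patch_imm`, `play_back`, and `my_map` are hypothetical. A real light skeleton performs the equivalent steps from a loader eBPF program inside the kernel, not from Python.

```python
import struct

# One instruction slot: opcode (u8), regs (u8), off (s16), imm (s32).
INSN = struct.Struct("<BBhi")


def patch_imm(blob: bytearray, index: int, value: int) -> None:
    """Rewrite the 32-bit immediate of the instruction slot at `index`."""
    opcode, regs, off, _placeholder = INSN.unpack_from(blob, index * INSN.size)
    INSN.pack_into(blob, index * INSN.size, opcode, regs, off, value)


def play_back(blob: bytes, relocs, resolved) -> bytes:
    """Apply each recorded (slot_index, symbol) step with load-time values."""
    out = bytearray(blob)
    for index, symbol in relocs:
        patch_imm(out, index, resolved[symbol])
    return bytes(out)


# Hard-coded blob: a two-slot 64-bit load whose immediate is a
# placeholder (0), followed by an exit instruction.
blob = (INSN.pack(0x18, 0x11, 0, 0)
        + INSN.pack(0x00, 0x00, 0, 0)
        + INSN.pack(0x95, 0x00, 0, 0))

# The "script": slot 0 needs the fd of a map called my_map (hypothetical).
relocs = [(0, "my_map")]

# At load time the concrete value becomes known, here fd 7.
patched = play_back(blob, relocs, {"my_map": 7})
```

Because the blob and the list of rewrite steps are both fixed at build time, they can be covered by a signature, while the concrete values only appear at load time.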
Jarkko Sakkinen <jarkko@kernel.org> writes:

> On Mon, Mar 31, 2025 at 01:57:15PM -0700, Blaise Boscaccy wrote:
>> There are two flavors of skeletons, normal skeletons, and light
>> skeletons. Normal skeletons utilize relocation logic that lives in
>> libbpf, and the relocations/instruction rewriting happen in userspace.
>> The second flavor, light skeletons, uses a small eBPF program that
>> contains the relocation lookup logic. As it's running in the kernel,
>> it unpacks the target program, performs the instruction rewriting, and
>> loads the target program. Light skeletons are currently utilized for
>> some drivers, and BPF_PRELOAD functionality since they can operate
>> without userspace.
>>
>> Light skeletons were recommended on various mailing list discussions as
>> the preferred path to performing signature verification. There are some
>> PoCs floating around that used light-skeletons in concert with
>> fs-verity/IMA and eBPF LSMs. We took a slightly different approach to
>> Hornet, by utilizing the existing PKCS#7 signing scheme that is used for
>> kernel modules.
>
> Right, because in the normal skeletons relocation logic remains
> unsigned?
>

Yup, Exactly.

> I have to admit I don't fully cope how the relocation process translates
> into eBPF program but I do get how it is better for signatures if it
> does :-)
>
>>
>> >> verification. Signature data can be easily generated for the binary
>> >
>> > s/easily//
>> >
>> > Useless word having no measure.
>> >
>>
>> Ack, thanks.
>>
>>
>> >> data that is generated via bpftool gen -L. This signature can be
>> >
>> > I have no idea what that command does.
>> >
>> > "Signature data can be generated for the binary data as follows:
>> >
>> > bpftool gen -L
>> >
>> > <explanation>"
>> >
>> > Here you'd need to answer to couple of unknowns:
>> >
>> > 1. What is in exact terms "signature data"?
>>
>> That is a PKCS#7 signature of a data buffer containing the raw
>> instructions of an eBPF program, followed by the initial values of any
>> maps used by the program.
>
> Got it, thanks. This motivates to refine my TPM2 asymmetric keys
> series so that TPM2 could anchor these :-)
>
> https://lore.kernel.org/linux-integrity/20240528210823.28798-1-jarkko@kernel.org/
>
>

Oooh. That would be very nice :)

>>
>> > 2. What does "bpftool gen -L" do?
>> >
>>
>> eBPF programs often have 2 parts. An orchestrator/loader program that
>> provides load -> attach/run -> i/o -> teardown logic and the in-kernel
>> program.
>>
>> That command is used to generate a skeleton which can be used by the
>> orchestrator program. Skeletons get generated as a C header file, that
>> contains various autogenerated functions that open and load bpf programs
>> as described above. That header file ends up being included in a
>> userspace orchestrator program or possibly a kernel module.
>
> I did read the man page now too, but thanks for the commentary!
>
>>
>> > This feedback maps to other examples too in the cover letter.
>> >
>> > BR, Jarkko
>>
>>
>> I'll rework this with some definitions of the eBPF subsystem jargon
>> along with your suggestions.
>
> Yeah, you should be able to put the gist a factor better to nutshell :-)
>
>>
>> -blaise
>
> BR, Jarkko