diff mbox series

[1/3] mm: vmalloc: export __vmalloc_node_range

Message ID 20240718011504.4106163-2-mstrodl@csh.rit.edu (mailing list archive)
State New
Headers show
Series Add support for Congatec CGEB BIOS interface | expand

Commit Message

Mary Strodl July 18, 2024, 1:15 a.m. UTC
After the ability to allocate PAGE_KERNEL_EXEC memory was removed
from __vmalloc, this seems like the least invasive way to expose
the capability to drivers that need it.

Exports __vmalloc_node_range so that drivers can use it.

Signed-off-by: Mary Strodl <mstrodl@csh.rit.edu>
---
 mm/vmalloc.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Andrew Morton July 18, 2024, 2:53 a.m. UTC | #1
On Wed, 17 Jul 2024 21:15:02 -0400 Mary Strodl <mstrodl@csh.rit.edu> wrote:

> After the ability to allocate PAGE_KERNEL_EXEC memory was removed
> from __vmalloc

Removed by which commit?

> this seems like the least invasive way to expose
> the capability to drivers that need it.
> 
> Exports __vmalloc_node_range so that drivers can use it.

Why does this driver need a thing which no other driver does?
Christoph Hellwig July 18, 2024, 3:04 a.m. UTC | #2
On Wed, Jul 17, 2024 at 09:15:02PM -0400, Mary Strodl wrote:
> After the ability to allocate PAGE_KERNEL_EXEC memory was removed
> from __vmalloc,

Yes. for a very good reason.

NAK to a driver creating random writable and exectutable memory:

Nacked-by: Christoph Hellwig <hch@lst.de>
Mary Strodl July 18, 2024, 12:29 p.m. UTC | #3
On Wed, Jul 17, 2024 at 07:53:23PM -0700, Andrew Morton wrote:
> Removed by which commit?

Thanks. This was removed in:
88dca4ca5a93 mm: remove the pgprot argument to __vmalloc

It was removed because every driver was passing PAGE_KERNEL

> Why does this driver need a thing which no other driver does?

You can	find more information in the manufacturer's docs:
https://www.congatec.com/fileadmin/user_upload/Documents/Manual/CGOSAPI.pdf

In particular, section 1 (page 11) describes how they intend for
it to work.

Basically, they	provide	an x86 blob in the BIOS	that we	copy
into kernelspace and mark executable. Then, we can call into that
blob to access congatec's special hardware.

Thanks again for reviewing my patches!

If there is a better way to do what I'm trying to do, please let
me know. It's possible I'm missing something obvious.
Mary Strodl July 18, 2024, 12:40 p.m. UTC | #4
On Wed, Jul 17, 2024 at 08:04:01PM -0700, Christoph Hellwig wrote:
> NAK to a driver creating random writable and exectutable memory:

Is there a better way to do what I'm trying to do? Or some way to
expose this functionality with more guard rails so that it's a
little bit safer?

Thank you for taking time to review my patches!
Matthew Wilcox July 18, 2024, 12:45 p.m. UTC | #5
On Thu, Jul 18, 2024 at 08:40:27AM -0400, Mary Strodl wrote:
> On Wed, Jul 17, 2024 at 08:04:01PM -0700, Christoph Hellwig wrote:
> > NAK to a driver creating random writable and exectutable memory:
> 
> Is there a better way to do what I'm trying to do? Or some way to
> expose this functionality with more guard rails so that it's a
> little bit safer?

No, there is no way to do what you're trying to do.
Christoph Hellwig July 18, 2024, 12:49 p.m. UTC | #6
On Thu, Jul 18, 2024 at 01:45:11PM +0100, Matthew Wilcox wrote:
> On Thu, Jul 18, 2024 at 08:40:27AM -0400, Mary Strodl wrote:
> > On Wed, Jul 17, 2024 at 08:04:01PM -0700, Christoph Hellwig wrote:
> > > NAK to a driver creating random writable and exectutable memory:
> > 
> > Is there a better way to do what I'm trying to do? Or some way to
> > expose this functionality with more guard rails so that it's a
> > little bit safer?
> 
> No, there is no way to do what you're trying to do.

IFF it is absolutely required to run BIOS provided executable code,
the best way to do that is in a separate userspace process.
Matthew Wilcox July 18, 2024, 12:53 p.m. UTC | #7
On Thu, Jul 18, 2024 at 05:49:14AM -0700, Christoph Hellwig wrote:
> On Thu, Jul 18, 2024 at 01:45:11PM +0100, Matthew Wilcox wrote:
> > On Thu, Jul 18, 2024 at 08:40:27AM -0400, Mary Strodl wrote:
> > > On Wed, Jul 17, 2024 at 08:04:01PM -0700, Christoph Hellwig wrote:
> > > > NAK to a driver creating random writable and exectutable memory:
> > > 
> > > Is there a better way to do what I'm trying to do? Or some way to
> > > expose this functionality with more guard rails so that it's a
> > > little bit safer?
> > 
> > No, there is no way to do what you're trying to do.
> 
> IFF it is absolutely required to run BIOS provided executable code,
> the best way to do that is in a separate userspace process.

That does work, but I would assume that since this BIOS exists to
communicate with the hardware that it'd need various special privileges
and that running in ring 3 would not work.

Ultimately, better off running the whole thing inside a VM and passing
the device through to the guest.  Or ignoring the BIOS entirely and
implementing direct access to the hardware.  But neither of these
approaches is "a way to do what I'm trying to do", they're entirely
different approaches to making this hardware work.
Mary Strodl July 18, 2024, 1:20 p.m. UTC | #8
On Thu, Jul 18, 2024 at 01:53:23PM +0100, Matthew Wilcox wrote:
> That does work, but I would assume that since this BIOS exists to
> communicate with the hardware that it'd need various special privileges
> and that running in ring 3 would not work.

Exactly.
 
> Ultimately, better off running the whole thing inside a VM and passing
> the device through to the guest.  Or ignoring the BIOS entirely and
> implementing direct access to the hardware.  But neither of these
> approaches is "a way to do what I'm trying to do", they're entirely
> different approaches to making this hardware work.

If I ran the whole thing inside a VM, I would still need support in the
kernel, right?

As far as I know, there is no documentation on Congatec's side about the
underlying interface. Obviously I could disassemble the blob in the BIOS
and figure it out, but I suspect that will have much less hardware
compatibility and be subject to random breakage if they make a BIOS
update or something. Plus, I would probably run afoul of copyright if I
wrote a driver after doing that.

I'm not really thrilled that this is their design either, but I'm not
sure that there is a better answer...

Thank you!
Andrew Morton July 18, 2024, 9:31 p.m. UTC | #9
On Thu, 18 Jul 2024 09:20:15 -0400 Mary Strodl <mstrodl@freedom.csh.rit.edu> wrote:

> On Thu, Jul 18, 2024 at 01:53:23PM +0100, Matthew Wilcox wrote:
> > That does work, but I would assume that since this BIOS exists to
> > communicate with the hardware that it'd need various special privileges
> > and that running in ring 3 would not work.
> 
> Exactly.
>  
> > Ultimately, better off running the whole thing inside a VM and passing
> > the device through to the guest.  Or ignoring the BIOS entirely and
> > implementing direct access to the hardware.  But neither of these
> > approaches is "a way to do what I'm trying to do", they're entirely
> > different approaches to making this hardware work.
> 
> If I ran the whole thing inside a VM, I would still need support in the
> kernel, right?
> 
> As far as I know, there is no documentation on Congatec's side about the
> underlying interface. Obviously I could disassemble the blob in the BIOS
> and figure it out, but I suspect that will have much less hardware
> compatibility and be subject to random breakage if they make a BIOS
> update or something. Plus, I would probably run afoul of copyright if I
> wrote a driver after doing that.
> 
> I'm not really thrilled that this is their design either, but I'm not
> sure that there is a better answer...
> 

The hardware is weird, but we should try to support it in some fashion.
But without making dangerous functionality more widely available.  So
we're looking for some solution which can be fully contained within
that hardware's driver.

Dumb idea, there will be other ideas: is it practical to take that code
blob out of the BIOS, put it into a kernel module (as a .byte table in
a .s file and suitable C interfacing), compile that up and insmod that
module?
Matthew Wilcox July 18, 2024, 9:35 p.m. UTC | #10
On Thu, Jul 18, 2024 at 02:31:03PM -0700, Andrew Morton wrote:
> The hardware is weird, but we should try to support it in some fashion.

Why?  It's been around since 2005, and Linux has done perfectly well
without support for it.  What's so compelling about it, as compared to
ohidontknow the nVidia GPU driver where we definitely don't support
taking a binary blob of random x86 code and run it inside the kernel?

> Dumb idea, there will be other ideas: is it practical to take that code
> blob out of the BIOS, put it into a kernel module (as a .byte table in
> a .s file and suitable C interfacing), compile that up and insmod that
> module?  

Have you tried asking someone who cares about security like Kees?
Preferably in person so you can take a picture of the hair on their head
standing straight out and steam coming out of their ears.
Andrew Morton July 18, 2024, 9:39 p.m. UTC | #11
On Thu, 18 Jul 2024 22:35:02 +0100 Matthew Wilcox <willy@infradead.org> wrote:

> On Thu, Jul 18, 2024 at 02:31:03PM -0700, Andrew Morton wrote:
> > The hardware is weird, but we should try to support it in some fashion.
> 
> Why?  It's been around since 2005, and Linux has done perfectly well
> without support for it. 

Oh.  I was assuming it was some new thing.  This does weaken the case
a lot.
Christian Gmeiner July 19, 2024, 6:41 a.m. UTC | #12
>
> On Thu, 18 Jul 2024 22:35:02 +0100 Matthew Wilcox <willy@infradead.org> wrote:
>
> > On Thu, Jul 18, 2024 at 02:31:03PM -0700, Andrew Morton wrote:
> > > The hardware is weird, but we should try to support it in some fashion.
> >
> > Why?  It's been around since 2005, and Linux has done perfectly well
> > without support for it.
>
> Oh.  I was assuming it was some new thing.  This does weaken the case
> a lot.

This wonderful interface is used in recent products from them too.
Adding support for it
in an upstream-able way could be still a benefit, as these products
are used in different
industrial environments running on Linux.
Mary Strodl July 19, 2024, 11:58 a.m. UTC | #13
On Fri, Jul 19, 2024 at 08:41:04AM +0200, Christian Gmeiner wrote:
> This wonderful interface is used in recent products from them too.
> Adding support for it
> in an upstream-able way could be still a benefit, as these products
> are used in different
> industrial environments running on Linux.

Just seconding this. The hardware we have here (conga-TCA7) was
released in 2021. As far as I know, congatec have been using this
interface for a while and provided a pretty bad out-of-tree driver
for it that needed a proprietary library in userspace to talk to
devices instead of actually registering with the kernel facilities
for i2c, watchdog, backlight, etc.

I think it's valuable functionality to support, but it'll need to
happen safely.

Maybe some of the stuff the driver does right now could be moved into
vmalloc? In other words, we could provide a different function that
allocates an executable page, copies memory into it, then marks it
read-only. Would that do better to alleviate concerns?

Not sure what the restrictions on x86 are, but we could also start
with a writable page, then mark it executable when we un-mark it
writable.

I think this is good discussion, thanks for sharing your thoughts
everybody.
Matthew Wilcox July 19, 2024, 12:42 p.m. UTC | #14
On Fri, Jul 19, 2024 at 07:58:40AM -0400, Mary Strodl wrote:
> Maybe some of the stuff the driver does right now could be moved into
> vmalloc? In other words, we could provide a different function that
> allocates an executable page, copies memory into it, then marks it
> read-only. Would that do better to alleviate concerns?

No.  We are not running arbitrary x86 code.  That is a security
nightmare.
Rudolf Marek July 19, 2024, 7:59 p.m. UTC | #15
Hi,

Dne 19. 07. 24 v 13:58 Mary Strodl napsal(a):
> I think this is good discussion, thanks for sharing your thoughts
> everybody.

I would suggest to simply run the BIOS code of this interface in usermode. Sort of similar to VM86 VESA stuff.
Last time I looked into this it used STI/CLI/RDMSR/WRMSR and couple of I/O ports and cf8/cfc for PCI.

Thanks,
Rudolf
Mary Strodl July 22, 2024, 2:54 p.m. UTC | #16
On Fri, Jul 19, 2024 at 09:59:37PM +0200, Rudolf Marek wrote:
> I would suggest to simply run the BIOS code of this interface in usermode. Sort of similar to VM86 VESA stuff.
> Last time I looked into this it used STI/CLI/RDMSR/WRMSR and couple of I/O ports and cf8/cfc for PCI.

I took a look at uvesafb (which appears to be what you were talking about) and
it looks like it starts a separate executable and uses some IPC to talk to it.
Is that the best way to do it?

I guess it would look something like:
- driver gets loaded
- driver spawns /sbin/cgeb-helper
- driver uses cn_netlink_send to send the `high_desc` to helper

Then the calls to `board->entry` in the driver get replaced with
`cn_netlink_send` with a `cgeb_fps`.

When the userspace helper gets the message with the `cgeb_fps`, it calls into
the bios code and replies to the driver with send() and passes back cgeb_fps.

From there, a callback registered with cn_add_callback will pick up the message
and call `wake_up_interruptible` to send the message back to the `cgeb_call`
caller.

Is this what you were imagining? Or is there a simpler way to do it?

Where should the code for the userspace helper live? The vesa stuff appeared to
be out of tree.
Andrew Morton July 24, 2024, midnight UTC | #17
On Fri, 19 Jul 2024 13:42:40 +0100 Matthew Wilcox <willy@infradead.org> wrote:

> On Fri, Jul 19, 2024 at 07:58:40AM -0400, Mary Strodl wrote:
> > Maybe some of the stuff the driver does right now could be moved into
> > vmalloc? In other words, we could provide a different function that
> > allocates an executable page, copies memory into it, then marks it
> > read-only. Would that do better to alleviate concerns?
> 
> No.  We are not running arbitrary x86 code.  That is a security
> nightmare.

Sure, if such a thing were to be done we'd want it localized within the
driver rather than offered globally.

But if there was some hack within the driver to do this, what problems
might that cause?  What are the scenarios?
Matthew Wilcox July 24, 2024, 12:16 a.m. UTC | #18
On Tue, Jul 23, 2024 at 05:00:43PM -0700, Andrew Morton wrote:
> On Fri, 19 Jul 2024 13:42:40 +0100 Matthew Wilcox <willy@infradead.org> wrote:
> 
> > On Fri, Jul 19, 2024 at 07:58:40AM -0400, Mary Strodl wrote:
> > > Maybe some of the stuff the driver does right now could be moved into
> > > vmalloc? In other words, we could provide a different function that
> > > allocates an executable page, copies memory into it, then marks it
> > > read-only. Would that do better to alleviate concerns?
> > 
> > No.  We are not running arbitrary x86 code.  That is a security
> > nightmare.
> 
> Sure, if such a thing were to be done we'd want it localized within the
> driver rather than offered globally.
> 
> But if there was some hack within the driver to do this, what problems
> might that cause?  What are the scenarios?

That we're running arbitrary x86 code (provided by the manufacturer)
inside the kernel where it can undermine every security guarantee we
provide?
Christoph Hellwig July 24, 2024, 1:36 a.m. UTC | #19
On Wed, Jul 24, 2024 at 01:16:02AM +0100, Matthew Wilcox wrote:
> That we're running arbitrary x86 code (provided by the manufacturer)
> inside the kernel where it can undermine every security guarantee we
> provide?

.. and by exporting the interface allow arbitrary other code including
exploit code to allocate writable and executable memory?
diff mbox series

Patch

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e34ea860153f..037b7e0fe430 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3879,6 +3879,7 @@  void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
 
 	return NULL;
 }
+EXPORT_SYMBOL(__vmalloc_node_range_noprof);
 
 /**
  * __vmalloc_node - allocate virtually contiguous memory