diff mbox series

[v4,2/3] rust: pci: impl TryFrom<&Device> for &pci::Device

Message ID 20250321214826.140946-3-dakr@kernel.org (mailing list archive)
State Handled Elsewhere
Delegated to: Bjorn Helgaas
Headers show
Series Implement TryFrom<&Device> for bus specific devices | expand

Commit Message

Danilo Krummrich March 21, 2025, 9:47 p.m. UTC
Implement TryFrom<&device::Device> for &Device.

This allows us to get a &pci::Device from a generic &Device in a safe
way; the conversion fails if the device' bus type does not match with
the PCI bus type.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 rust/kernel/device.rs |  1 -
 rust/kernel/pci.rs    | 25 +++++++++++++++++++++++--
 2 files changed, 23 insertions(+), 3 deletions(-)

Comments

Greg KH March 22, 2025, 3:25 a.m. UTC | #1
On Fri, Mar 21, 2025 at 10:47:54PM +0100, Danilo Krummrich wrote:
> Implement TryFrom<&device::Device> for &Device.
> 
> This allows us to get a &pci::Device from a generic &Device in a safe
> way; the conversion fails if the device' bus type does not match with
> the PCI bus type.

In thinking about this some more, I am starting to not really like it at
all, more below...

> +impl TryFrom<&device::Device> for &Device {
> +    type Error = kernel::error::Error;
> +
> +    fn try_from(dev: &device::Device) -> Result<Self, Self::Error> {
> +        #[allow(unused_unsafe)]
> +        // SAFETY: rustc < 1.82 falsely treats this as unsafe, see:
> +        // https://blog.rust-lang.org/2024/10/17/Rust-1.82.0.html#safely-addressing-unsafe-statics
> +        if dev.bus_type_raw() != unsafe { addr_of!(bindings::pci_bus_type) } {
> +            return Err(EINVAL);
> +        }

For the record, and to write it down so it's just not a discussion in
person like I've had in the past about this topic, I figured I'd say
something about why we didn't do this in C.

When we wrote the driver model/core all those decades ago, we made the
explicit decision to NOT encode the device type in the device structure.
We did this to "force" the user to "know" the type before casting it
away to the real type, allowing it to be done in a way that would always
"just work" because an error could never occur.

This is not a normal "design pattern" for objects, even then (i.e.
gobject does not do this), and over the years I still think this was a
good decision.  It allowed us to keep the sysfs callbacks simpler in
that it saved us yet-another-function-call-deep, and it also allowed us
to do some "fun" pointer casting tricks for sysfs attributes (much to
CFI's nightmare many years later), and it kept drivers simple, which in
the end is the key.  It also forced driver authors from doing "tricks"
where they could just try to figure out the device type and then do
something with that for many many years until we had to give in and
allow this to happen due to multi-device-sharing subsystems.  And even
then, I think that was the wrong choice.

Yes, over time, many subsystems were forced to come up with ways of
determining the device type, look at dev_is_pci() and where it's used
all over the place (wait, why can't you just use that here too?)  But
those usages are the rare exception.  And I still think that was the
wrong choice.

What I don't want to see is this type of function to be the only way you
can get a pci device from a struct device in rust.  That's going to be
messy as that is going to be a very common occurance (maybe, I haven't
seen how you use this), and would force everyone to always test the
return value, adding complexity and additional work for everyone, just
because the few wants it.

So where/how are you going to use this?  Why can't you just store the
"real" reference to the pci device?  If you want to save off a generic
device reference, where did it come from?  Why can't you just explicitly
save off the correct type at the time you stored it?  Is this just
attempting to save a few pointers?  Is this going to be required to be
called as the only way to get from a device to a pci device in a rust
driver?

Along these lines, if you can convince me that this is something that we
really should be doing, in that we should always be checking every time
someone would want to call to_pci_dev(), that the return value is
checked, then why don't we also do this in C if it's going to be
something to assure people it is going to be correct?  I don't want to
see the rust and C sides get "out of sync" here for things that can be
kept in sync, as that reduces the mental load of all of us as we travers
across the boundry for the next 20+ years.

Note, it's almost 10x more common to just use to_pci_dev() on it's own
and not call dev_is_pci(), so that gives you a sense of just how much
work every individual driver would have to do for such a rare
requirement.

thanks,

greg k-h
Danilo Krummrich March 22, 2025, 10:10 a.m. UTC | #2
On Fri, Mar 21, 2025 at 08:25:07PM -0700, Greg KH wrote:
> On Fri, Mar 21, 2025 at 10:47:54PM +0100, Danilo Krummrich wrote:
>
> Yes, over time, many subsystems were forced to come up with ways of
> determining the device type, look at dev_is_pci() and where it's used
> all over the place (wait, why can't you just use that here too?)

It's a macro and bindgen doesn't pick it up for us to use.

> But
> those usages are the rare exception.  And I still think that was the
> wrong choice.
> 
> What I don't want to see is this type of function to be the only way you
> can get a pci device from a struct device in rust.

I see where you're coming from and I totally agree with that.

The goal is to cover as much as possible by the corresponding abstractions, such
that the abstraction already provide the driver with the correct (device) type,
just like the bus callbacks do.

We can do the same thing with IOCTLs, sysfs ops, etc. The miscdevice
abstractions are a good example for this. The DRM abstractions will also provide
the DRM device (typed to its corresponding driver) for all DRM IOCTLs a driver
registers.

> That's going to be
> messy as that is going to be a very common occurance (maybe, I haven't
> seen how you use this), and would force everyone to always test the
> return value, adding complexity and additional work for everyone, just
> because the few wants it.

Given the above, I think in most cases we are (and plan to be) a step ahead and
(almost) never have the driver in the situation that it ever needs any
container_of() like thing, because the abstraction already provides the correct
type.

However, there may be some rare exceptions, where we need the driver to
"upcast", more below.

> So where/how are you going to use this?  Why can't you just store the
> "real" reference to the pci device?  If you want to save off a generic
> device reference, where did it come from?  Why can't you just explicitly
> save off the correct type at the time you stored it?  Is this just
> attempting to save a few pointers?  Is this going to be required to be
> called as the only way to get from a device to a pci device in a rust
> driver?

As mentioned above, wherever the abstraction can already provide the correct
(device) type to the driver, that's what we should do instead.

One specific use-case I see with this is for the auxiliary bus and likely MFD,
where we want to access the parent of a certain device.

For instance, in nova-drm we'll probe against an auxiliary device registered by
nova-core. nova-drm will use its auxiliary device to call into nova-core.
nova-core can then validate that it is the originator of the auxiliary device
(e.g. by name and ID, or simply pointer comparison).

Subsequently, nova-core can find its (in this case) pci::Device instance through
the parent device (Note that I changed this, such that you can only get the
parent device from an auxiliary::Device, as discussed).

	pub fn some_operation(adev: &auxiliary::Device, ...) -> ... {
	   // validate that `adev` was created by us

	   if let Some(parent) = adev.parent() {
              let pdev: &pci::Device = parent.try_into()?;
	      ...
	   }
	}

Now, I understand that your concern isn't necessarily about the fact that we
"upcast", but that it is fallible.

And you're not wrong about this, nova-core knows for sure that the parent device
must be its PCI device, since it can indeed validate that the auxiliary device
was created and registered by itself.

However, any other API that does not validate that `parent` is actually a
pci::Device in the example above would be fundamentally unsafe. And I think we
don't want to give out unsafe APIs to drivers.

Given that this is only needed in very rare cases (since most of the time the
driver should be provided with the correct type directly) I'm not concerned
about the small overhead of the underlying pointer comparison. It's also not
creating messy code as it would be in C.

In C it would be

		struct pci_dev *pdev;

		if (!dev_is_pci(parent))
			return -EINVAL;

		pdev = to_pci_dev(parent);

Whereas in Rust it's only

		let pdev: &pci::Device = parent.try_into()?;

and without any validation in Rust

		// SAFETY: explain why this is safe to do
		let pdev = unsafe { pci::Device::from_device(parent) };

> Along these lines, if you can convince me that this is something that we
> really should be doing, in that we should always be checking every time
> someone would want to call to_pci_dev(), that the return value is
> checked, then why don't we also do this in C if it's going to be
> something to assure people it is going to be correct?  I don't want to
> see the rust and C sides get "out of sync" here for things that can be
> kept in sync, as that reduces the mental load of all of us as we travers
> across the boundry for the next 20+ years.

I think in this case it is good when the C and Rust side get a bit
"out of sync":

For most of the cases where we can cover things through the type system and
already provide the driver with the correct object instance from the get-go,
that's a great improvement over C.

For the very rare cases where we have to "upcast", I really think it's better to
uphold providing safe APIs to drivers, rather than unsafe ones.
Danilo Krummrich March 23, 2025, 10:10 p.m. UTC | #3
On Sat, Mar 22, 2025 at 11:10:57AM +0100, Danilo Krummrich wrote:
> On Fri, Mar 21, 2025 at 08:25:07PM -0700, Greg KH wrote:
> > On Fri, Mar 21, 2025 at 10:47:54PM +0100, Danilo Krummrich wrote:
> >
> > Yes, over time, many subsystems were forced to come up with ways of
> > determining the device type, look at dev_is_pci() and where it's used
> > all over the place (wait, why can't you just use that here too?)
> 
> It's a macro and bindgen doesn't pick it up for us to use.
> 
> > But
> > those usages are the rare exception.  And I still think that was the
> > wrong choice.
> > 
> > What I don't want to see is this type of function to be the only way you
> > can get a pci device from a struct device in rust.
> 
> I see where you're coming from and I totally agree with that.
> 
> The goal is to cover as much as possible by the corresponding abstractions, such
> that the abstraction already provide the driver with the correct (device) type,
> just like the bus callbacks do.
> 
> We can do the same thing with IOCTLs, sysfs ops, etc. The miscdevice
> abstractions are a good example for this. The DRM abstractions will also provide
> the DRM device (typed to its corresponding driver) for all DRM IOCTLs a driver
> registers.
> 
> > That's going to be
> > messy as that is going to be a very common occurance (maybe, I haven't
> > seen how you use this), and would force everyone to always test the
> > return value, adding complexity and additional work for everyone, just
> > because the few wants it.
> 
> Given the above, I think in most cases we are (and plan to be) a step ahead and
> (almost) never have the driver in the situation that it ever needs any
> container_of() like thing, because the abstraction already provides the correct
> type.
> 
> However, there may be some rare exceptions, where we need the driver to
> "upcast", more below.
> 
> > So where/how are you going to use this?  Why can't you just store the
> > "real" reference to the pci device?  If you want to save off a generic
> > device reference, where did it come from?  Why can't you just explicitly
> > save off the correct type at the time you stored it?  Is this just
> > attempting to save a few pointers?  Is this going to be required to be
> > called as the only way to get from a device to a pci device in a rust
> > driver?
> 
> As mentioned above, wherever the abstraction can already provide the correct
> (device) type to the driver, that's what we should do instead.
> 
> One specific use-case I see with this is for the auxiliary bus and likely MFD,
> where we want to access the parent of a certain device.
> 
> For instance, in nova-drm we'll probe against an auxiliary device registered by
> nova-core. nova-drm will use its auxiliary device to call into nova-core.
> nova-core can then validate that it is the originator of the auxiliary device
> (e.g. by name and ID, or simply pointer comparison).
> 
> Subsequently, nova-core can find its (in this case) pci::Device instance through
> the parent device (Note that I changed this, such that you can only get the
> parent device from an auxiliary::Device, as discussed).
> 
> 	pub fn some_operation(adev: &auxiliary::Device, ...) -> ... {
> 	   // validate that `adev` was created by us
> 
> 	   if let Some(parent) = adev.parent() {
>               let pdev: &pci::Device = parent.try_into()?;
> 	      ...
> 	   }
> 	}
> 
> Now, I understand that your concern isn't necessarily about the fact that we
> "upcast", but that it is fallible.
> 
> And you're not wrong about this, nova-core knows for sure that the parent device
> must be its PCI device, since it can indeed validate that the auxiliary device
> was created and registered by itself.
> 
> However, any other API that does not validate that `parent` is actually a
> pci::Device in the example above would be fundamentally unsafe. And I think we
> don't want to give out unsafe APIs to drivers.
> 
> Given that this is only needed in very rare cases (since most of the time the
> driver should be provided with the correct type directly) I'm not concerned
> about the small overhead of the underlying pointer comparison. It's also not
> creating messy code as it would be in C.
> 
> In C it would be
> 
> 		struct pci_dev *pdev;
> 
> 		if (!dev_is_pci(parent))
> 			return -EINVAL;
> 
> 		pdev = to_pci_dev(parent);
> 
> Whereas in Rust it's only
> 
> 		let pdev: &pci::Device = parent.try_into()?;
> 
> and without any validation in Rust
> 
> 		// SAFETY: explain why this is safe to do
> 		let pdev = unsafe { pci::Device::from_device(parent) };
> 
> > Along these lines, if you can convince me that this is something that we
> > really should be doing, in that we should always be checking every time
> > someone would want to call to_pci_dev(), that the return value is
> > checked, then why don't we also do this in C if it's going to be
> > something to assure people it is going to be correct?  I don't want to
> > see the rust and C sides get "out of sync" here for things that can be
> > kept in sync, as that reduces the mental load of all of us as we travers
> > across the boundry for the next 20+ years.
> 
> I think in this case it is good when the C and Rust side get a bit
> "out of sync":

A bit more clarification on this:

What I want to say with this is, since we can cover a lot of the common cases
through abstractions and the type system, we're left with the not so common
ones, where the "upcasts" are not made in the context of common and well
established patterns, but, for instance, depend on the semantics of the driver;
those should not be unsafe IMHO.

I think for C the situation is a bit different, because there we don't have a
type system that can take care of the majority of cases (at compile time). Every
such operation is fundamentally unsafe. So, it's a bit hard to draw a boundary.

(In Rust, it's pretty simple to draw the boundary; if it can't be covered by the
type system or by the abstraction, we need to check to uphold the safety
guarantees.)

So, for the majority of cases, since it's *always* an additional runtime check
in C, I think we shouldn't have it. But probably there are rare cases where it
also wouldn't be the worst idea to check. :)

In Rust we should be able to cover the majority of cases through abstractions
and the type system, but in the cases where we can't do so, I don't think we
should fall back to unsafe code for drivers.

> For most of the cases where we can cover things through the type system and
> already provide the driver with the correct object instance from the get-go,
> that's a great improvement over C.
> 
> For the very rare cases where we have to "upcast", I really think it's better to
> uphold providing safe APIs to drivers, rather than unsafe ones.
Greg KH March 24, 2025, 1:54 p.m. UTC | #4
On Sun, Mar 23, 2025 at 11:10:27PM +0100, Danilo Krummrich wrote:
> On Sat, Mar 22, 2025 at 11:10:57AM +0100, Danilo Krummrich wrote:
> > On Fri, Mar 21, 2025 at 08:25:07PM -0700, Greg KH wrote:
> > > On Fri, Mar 21, 2025 at 10:47:54PM +0100, Danilo Krummrich wrote:
> > >
> > > Yes, over time, many subsystems were forced to come up with ways of
> > > determining the device type, look at dev_is_pci() and where it's used
> > > all over the place (wait, why can't you just use that here too?)
> > 
> > It's a macro and bindgen doesn't pick it up for us to use.
> > 
> > > But
> > > those usages are the rare exception.  And I still think that was the
> > > wrong choice.
> > > 
> > > What I don't want to see is this type of function to be the only way you
> > > can get a pci device from a struct device in rust.
> > 
> > I see where you're coming from and I totally agree with that.
> > 
> > The goal is to cover as much as possible by the corresponding abstractions, such
> > that the abstraction already provide the driver with the correct (device) type,
> > just like the bus callbacks do.
> > 
> > We can do the same thing with IOCTLs, sysfs ops, etc. The miscdevice
> > abstractions are a good example for this. The DRM abstractions will also provide
> > the DRM device (typed to its corresponding driver) for all DRM IOCTLs a driver
> > registers.
> > 
> > > That's going to be
> > > messy as that is going to be a very common occurance (maybe, I haven't
> > > seen how you use this), and would force everyone to always test the
> > > return value, adding complexity and additional work for everyone, just
> > > because the few wants it.
> > 
> > Given the above, I think in most cases we are (and plan to be) a step ahead and
> > (almost) never have the driver in the situation that it ever needs any
> > container_of() like thing, because the abstraction already provides the correct
> > type.
> > 
> > However, there may be some rare exceptions, where we need the driver to
> > "upcast", more below.
> > 
> > > So where/how are you going to use this?  Why can't you just store the
> > > "real" reference to the pci device?  If you want to save off a generic
> > > device reference, where did it come from?  Why can't you just explicitly
> > > save off the correct type at the time you stored it?  Is this just
> > > attempting to save a few pointers?  Is this going to be required to be
> > > called as the only way to get from a device to a pci device in a rust
> > > driver?
> > 
> > As mentioned above, wherever the abstraction can already provide the correct
> > (device) type to the driver, that's what we should do instead.
> > 
> > One specific use-case I see with this is for the auxiliary bus and likely MFD,
> > where we want to access the parent of a certain device.
> > 
> > For instance, in nova-drm we'll probe against an auxiliary device registered by
> > nova-core. nova-drm will use its auxiliary device to call into nova-core.
> > nova-core can then validate that it is the originator of the auxiliary device
> > (e.g. by name and ID, or simply pointer comparison).
> > 
> > Subsequently, nova-core can find its (in this case) pci::Device instance through
> > the parent device (Note that I changed this, such that you can only get the
> > parent device from an auxiliary::Device, as discussed).
> > 
> > 	pub fn some_operation(adev: &auxiliary::Device, ...) -> ... {
> > 	   // validate that `adev` was created by us
> > 
> > 	   if let Some(parent) = adev.parent() {
> >               let pdev: &pci::Device = parent.try_into()?;
> > 	      ...
> > 	   }
> > 	}
> > 
> > Now, I understand that your concern isn't necessarily about the fact that we
> > "upcast", but that it is fallible.
> > 
> > And you're not wrong about this, nova-core knows for sure that the parent device
> > must be its PCI device, since it can indeed validate that the auxiliary device
> > was created and registered by itself.
> > 
> > However, any other API that does not validate that `parent` is actually a
> > pci::Device in the example above would be fundamentally unsafe. And I think we
> > don't want to give out unsafe APIs to drivers.
> > 
> > Given that this is only needed in very rare cases (since most of the time the
> > driver should be provided with the correct type directly) I'm not concerned
> > about the small overhead of the underlying pointer comparison. It's also not
> > creating messy code as it would be in C.
> > 
> > In C it would be
> > 
> > 		struct pci_dev *pdev;
> > 
> > 		if (!dev_is_pci(parent))
> > 			return -EINVAL;
> > 
> > 		pdev = to_pci_dev(parent);
> > 
> > Whereas in Rust it's only
> > 
> > 		let pdev: &pci::Device = parent.try_into()?;
> > 
> > and without any validation in Rust
> > 
> > 		// SAFETY: explain why this is safe to do
> > 		let pdev = unsafe { pci::Device::from_device(parent) };
> > 
> > > Along these lines, if you can convince me that this is something that we
> > > really should be doing, in that we should always be checking every time
> > > someone would want to call to_pci_dev(), that the return value is
> > > checked, then why don't we also do this in C if it's going to be
> > > something to assure people it is going to be correct?  I don't want to
> > > see the rust and C sides get "out of sync" here for things that can be
> > > kept in sync, as that reduces the mental load of all of us as we travers
> > > across the boundry for the next 20+ years.
> > 
> > I think in this case it is good when the C and Rust side get a bit
> > "out of sync":
> 
> A bit more clarification on this:
> 
> What I want to say with this is, since we can cover a lot of the common cases
> through abstractions and the type system, we're left with the not so common
> ones, where the "upcasts" are not made in the context of common and well
> established patterns, but, for instance, depend on the semantics of the driver;
> those should not be unsafe IMHO.
> 
> I think for C the situation is a bit different, because there we don't have a
> type system that can take care of the majority of cases (at compile time). Every
> such operation is fundamentally unsafe. So, it's a bit hard to draw a boundary.
> 
> (In Rust, it's pretty simple to draw the boundary; if it can't be covered by the
> type system or by the abstraction, we need to check to uphold the safety
> guarantees.)
> 
> So, for the majority of cases, since it's *always* an additional runtime check
> in C, I think we shouldn't have it. But probably there are rare cases where it
> also wouldn't be the worst idea to check. :)

Thanks for the long writeup, that makes more sense.

Ok, I'll not object to this change at all, so if you need them to go
through a different tree, feel free to take them.  But as I think it's
the merge window now, it will have to wait till -rc1 is out.

And yes, maybe we do need a version of this for C code. :)

> In Rust we should be able to cover the majority of cases through abstractions
> and the type system, but in the cases where we can't do so, I don't think we
> should fall back to unsafe code for drivers.
> 
> > For most of the cases where we can cover things through the type system and
> > already provide the driver with the correct object instance from the get-go,
> > that's a great improvement over C.
> > 
> > For the very rare cases where we have to "upcast", I really think it's better to
> > uphold providing safe APIs to drivers, rather than unsafe ones.

Fair enough, hopefully this makes things simpler for driver authors in
the end as it is not a common case.

greg k-h
Benno Lossin March 24, 2025, 4:39 p.m. UTC | #5
On Sun Mar 23, 2025 at 11:10 PM CET, Danilo Krummrich wrote:
> On Sat, Mar 22, 2025 at 11:10:57AM +0100, Danilo Krummrich wrote:
>> On Fri, Mar 21, 2025 at 08:25:07PM -0700, Greg KH wrote:
>> > Along these lines, if you can convince me that this is something that we
>> > really should be doing, in that we should always be checking every time
>> > someone would want to call to_pci_dev(), that the return value is
>> > checked, then why don't we also do this in C if it's going to be
>> > something to assure people it is going to be correct?  I don't want to
>> > see the rust and C sides get "out of sync" here for things that can be
>> > kept in sync, as that reduces the mental load of all of us as we travers
>> > across the boundry for the next 20+ years.
>> 
>> I think in this case it is good when the C and Rust side get a bit
>> "out of sync":
>
> A bit more clarification on this:
>
> What I want to say with this is, since we can cover a lot of the common cases
> through abstractions and the type system, we're left with the not so common
> ones, where the "upcasts" are not made in the context of common and well
> established patterns, but, for instance, depend on the semantics of the driver;
> those should not be unsafe IMHO.

I don't think that we should use `TryFrom` for stuff that should only be
used seldomly. A function that we can document properly is a much better
fit, since we can point users to the "correct" API.

---
Cheers,
Benno
Danilo Krummrich March 24, 2025, 4:49 p.m. UTC | #6
On Mon, Mar 24, 2025 at 04:39:25PM +0000, Benno Lossin wrote:
> On Sun Mar 23, 2025 at 11:10 PM CET, Danilo Krummrich wrote:
> > On Sat, Mar 22, 2025 at 11:10:57AM +0100, Danilo Krummrich wrote:
> >> On Fri, Mar 21, 2025 at 08:25:07PM -0700, Greg KH wrote:
> >> > Along these lines, if you can convince me that this is something that we
> >> > really should be doing, in that we should always be checking every time
> >> > someone would want to call to_pci_dev(), that the return value is
> >> > checked, then why don't we also do this in C if it's going to be
> >> > something to assure people it is going to be correct?  I don't want to
> >> > see the rust and C sides get "out of sync" here for things that can be
> >> > kept in sync, as that reduces the mental load of all of us as we travers
> >> > across the boundry for the next 20+ years.
> >> 
> >> I think in this case it is good when the C and Rust side get a bit
> >> "out of sync":
> >
> > A bit more clarification on this:
> >
> > What I want to say with this is, since we can cover a lot of the common cases
> > through abstractions and the type system, we're left with the not so common
> > ones, where the "upcasts" are not made in the context of common and well
> > established patterns, but, for instance, depend on the semantics of the driver;
> > those should not be unsafe IMHO.
> 
> I don't think that we should use `TryFrom` for stuff that should only be
> used seldomly. A function that we can document properly is a much better
> fit, since we can point users to the "correct" API.

Most of the cases where drivers would do this conversion should be covered by
the abstraction to already provide that actual bus specific device, rather than
a generic one or some priv pointer, etc.

So, the point is that the APIs we design won't leave drivers with a reason to
make this conversion in the first place. For the cases where they have to
(which should be rare), it's the right thing to do. There is not an alternative
API to point to.
Danilo Krummrich March 24, 2025, 5:05 p.m. UTC | #7
On Mon, Mar 24, 2025 at 06:54:47AM -0700, Greg KH wrote:
> 
> Thanks for the long writeup, that makes more sense.
> 
> Ok, I'll not object to this change at all, so if you need them to go
> through a different tree, feel free to take them.  But as I think it's
> the merge window now, it will have to wait till -rc1 is out.

Yes, I'll have to figure out what's the best approach of merging things.
Considering subsequent work, I want to make sure to create as few merge
conflicts as possible.
Benno Lossin March 24, 2025, 5:36 p.m. UTC | #8
On Mon Mar 24, 2025 at 5:49 PM CET, Danilo Krummrich wrote:
> On Mon, Mar 24, 2025 at 04:39:25PM +0000, Benno Lossin wrote:
>> On Sun Mar 23, 2025 at 11:10 PM CET, Danilo Krummrich wrote:
>> > On Sat, Mar 22, 2025 at 11:10:57AM +0100, Danilo Krummrich wrote:
>> >> On Fri, Mar 21, 2025 at 08:25:07PM -0700, Greg KH wrote:
>> >> > Along these lines, if you can convince me that this is something that we
>> >> > really should be doing, in that we should always be checking every time
>> >> > someone would want to call to_pci_dev(), that the return value is
>> >> > checked, then why don't we also do this in C if it's going to be
>> >> > something to assure people it is going to be correct?  I don't want to
>> >> > see the rust and C sides get "out of sync" here for things that can be
>> >> > kept in sync, as that reduces the mental load of all of us as we travers
>> >> > across the boundry for the next 20+ years.
>> >> 
>> >> I think in this case it is good when the C and Rust side get a bit
>> >> "out of sync":
>> >
>> > A bit more clarification on this:
>> >
>> > What I want to say with this is, since we can cover a lot of the common cases
>> > through abstractions and the type system, we're left with the not so common
>> > ones, where the "upcasts" are not made in the context of common and well
>> > established patterns, but, for instance, depend on the semantics of the driver;
>> > those should not be unsafe IMHO.
>> 
>> I don't think that we should use `TryFrom` for stuff that should only be
>> used seldomly. A function that we can document properly is a much better
>> fit, since we can point users to the "correct" API.
>
> Most of the cases where drivers would do this conversion should be covered by
> the abstraction to already provide that actual bus specific device, rather than
> a generic one or some priv pointer, etc.
>
> So, the point is that the APIs we design won't leave drivers with a reason to
> make this conversion in the first place. For the cases where they have to
> (which should be rare), it's the right thing to do. There is not an alternative
> API to point to.

Yes, but for such a case, I wouldn't want to use `TryFrom`, since that
trait to me is a sign of a canonical way to convert a value. So if it
shouldn't be used lightly, then I would prefer a normal method. Even if
there is no alternative API, we could say that it is unusual to use it
and the correct type should normally be available. This kind of
documentation is not possible with `TryFrom`.

For a user it won't make a big difference, they'll just call a method
not named `try_from`.

---
Cheers,
Benno
Danilo Krummrich March 24, 2025, 6:13 p.m. UTC | #9
On Mon, Mar 24, 2025 at 05:36:45PM +0000, Benno Lossin wrote:
> On Mon Mar 24, 2025 at 5:49 PM CET, Danilo Krummrich wrote:
> > On Mon, Mar 24, 2025 at 04:39:25PM +0000, Benno Lossin wrote:
> >> On Sun Mar 23, 2025 at 11:10 PM CET, Danilo Krummrich wrote:
> >> > On Sat, Mar 22, 2025 at 11:10:57AM +0100, Danilo Krummrich wrote:
> >> >> On Fri, Mar 21, 2025 at 08:25:07PM -0700, Greg KH wrote:
> >> >> > Along these lines, if you can convince me that this is something that we
> >> >> > really should be doing, in that we should always be checking every time
> >> >> > someone would want to call to_pci_dev(), that the return value is
> >> >> > checked, then why don't we also do this in C if it's going to be
> >> >> > something to assure people it is going to be correct?  I don't want to
> >> >> > see the rust and C sides get "out of sync" here for things that can be
> >> >> > kept in sync, as that reduces the mental load of all of us as we travers
> >> >> > across the boundry for the next 20+ years.
> >> >> 
> >> >> I think in this case it is good when the C and Rust side get a bit
> >> >> "out of sync":
> >> >
> >> > A bit more clarification on this:
> >> >
> >> > What I want to say with this is, since we can cover a lot of the common cases
> >> > through abstractions and the type system, we're left with the not so common
> >> > ones, where the "upcasts" are not made in the context of common and well
> >> > established patterns, but, for instance, depend on the semantics of the driver;
> >> > those should not be unsafe IMHO.
> >> 
> >> I don't think that we should use `TryFrom` for stuff that should only be
> >> used seldomly. A function that we can document properly is a much better
> >> fit, since we can point users to the "correct" API.
> >
> > Most of the cases where drivers would do this conversion should be covered by
> > the abstraction to already provide that actual bus specific device, rather than
> > a generic one or some priv pointer, etc.
> >
> > So, the point is that the APIs we design won't leave drivers with a reason to
> > make this conversion in the first place. For the cases where they have to
> > (which should be rare), it's the right thing to do. There is not an alternative
> > API to point to.
> 
> Yes, but for such a case, I wouldn't want to use `TryFrom`, since that
> trait to me is a sign of a canonical way to convert a value.

Well, it is the canonical way to convert, it's just that by the design of other
abstractions drivers should very rarely get in the situation of needing it in
the first place.
Benno Lossin March 24, 2025, 6:32 p.m. UTC | #10
On Mon Mar 24, 2025 at 7:13 PM CET, Danilo Krummrich wrote:
> On Mon, Mar 24, 2025 at 05:36:45PM +0000, Benno Lossin wrote:
>> On Mon Mar 24, 2025 at 5:49 PM CET, Danilo Krummrich wrote:
>> > On Mon, Mar 24, 2025 at 04:39:25PM +0000, Benno Lossin wrote:
>> >> On Sun Mar 23, 2025 at 11:10 PM CET, Danilo Krummrich wrote:
>> >> > On Sat, Mar 22, 2025 at 11:10:57AM +0100, Danilo Krummrich wrote:
>> >> >> On Fri, Mar 21, 2025 at 08:25:07PM -0700, Greg KH wrote:
>> >> >> > Along these lines, if you can convince me that this is something that we
>> >> >> > really should be doing, in that we should always be checking every time
>> >> >> > someone would want to call to_pci_dev(), that the return value is
>> >> >> > checked, then why don't we also do this in C if it's going to be
>> >> >> > something to assure people it is going to be correct?  I don't want to
>> >> >> > see the rust and C sides get "out of sync" here for things that can be
>> >> >> > kept in sync, as that reduces the mental load of all of us as we travers
>> >> >> > across the boundry for the next 20+ years.
>> >> >> 
>> >> >> I think in this case it is good when the C and Rust side get a bit
>> >> >> "out of sync":
>> >> >
>> >> > A bit more clarification on this:
>> >> >
>> >> > What I want to say with this is, since we can cover a lot of the common cases
>> >> > through abstractions and the type system, we're left with the not so common
>> >> > ones, where the "upcasts" are not made in the context of common and well
>> >> > established patterns, but, for instance, depend on the semantics of the driver;
>> >> > those should not be unsafe IMHO.
>> >> 
>> >> I don't think that we should use `TryFrom` for stuff that should only be
>> >> used seldomly. A function that we can document properly is a much better
>> >> fit, since we can point users to the "correct" API.
>> >
>> > Most of the cases where drivers would do this conversion should be covered by
>> > the abstraction to already provide that actual bus specific device, rather than
>> > a generic one or some priv pointer, etc.
>> >
>> > So, the point is that the APIs we design won't leave drivers with a reason to
>> > make this conversion in the first place. For the cases where they have to
>> > (which should be rare), it's the right thing to do. There is not an alternative
>> > API to point to.
>> 
>> Yes, but for such a case, I wouldn't want to use `TryFrom`, since that
>> trait to me is a sign of a canonical way to convert a value.
>
> Well, it is the canonical way to convert, it's just that by the design of other
> abstractions drivers should very rarely get in the situation of needing it in
> the first place.

I'd still prefer it though, since one can spot a

    let dev = CustomDevice::checked_from(dev)?

much better in review than the `try_from` conversion. It also prevents
one from giving it to a generic interface expecting the `TryFrom` trait.

---
Cheers,
Benno
Danilo Krummrich April 1, 2025, 1:51 p.m. UTC | #11
On Mon, Mar 24, 2025 at 06:32:53PM +0000, Benno Lossin wrote:
> On Mon Mar 24, 2025 at 7:13 PM CET, Danilo Krummrich wrote:
> > On Mon, Mar 24, 2025 at 05:36:45PM +0000, Benno Lossin wrote:
> >> On Mon Mar 24, 2025 at 5:49 PM CET, Danilo Krummrich wrote:
> >> > On Mon, Mar 24, 2025 at 04:39:25PM +0000, Benno Lossin wrote:
> >> >> On Sun Mar 23, 2025 at 11:10 PM CET, Danilo Krummrich wrote:
> >> >> > On Sat, Mar 22, 2025 at 11:10:57AM +0100, Danilo Krummrich wrote:
> >> >> >> On Fri, Mar 21, 2025 at 08:25:07PM -0700, Greg KH wrote:
> >> >> >> > Along these lines, if you can convince me that this is something that we
> >> >> >> > really should be doing, in that we should always be checking every time
> >> >> >> > someone would want to call to_pci_dev(), that the return value is
> >> >> >> > checked, then why don't we also do this in C if it's going to be
> >> >> >> > something to assure people it is going to be correct?  I don't want to
> >> >> >> > see the rust and C sides get "out of sync" here for things that can be
> >> >> >> > kept in sync, as that reduces the mental load of all of us as we travers
> >> >> >> > across the boundry for the next 20+ years.
> >> >> >> 
> >> >> >> I think in this case it is good when the C and Rust side get a bit
> >> >> >> "out of sync":
> >> >> >
> >> >> > A bit more clarification on this:
> >> >> >
> >> >> > What I want to say with this is, since we can cover a lot of the common cases
> >> >> > through abstractions and the type system, we're left with the not so common
> >> >> > ones, where the "upcasts" are not made in the context of common and well
> >> >> > established patterns, but, for instance, depend on the semantics of the driver;
> >> >> > those should not be unsafe IMHO.
> >> >> 
> >> >> I don't think that we should use `TryFrom` for stuff that should only be
> >> >> used seldomly. A function that we can document properly is a much better
> >> >> fit, since we can point users to the "correct" API.
> >> >
> >> > Most of the cases where drivers would do this conversion should be covered by
> >> > the abstraction to already provide that actual bus specific device, rather than
> >> > a generic one or some priv pointer, etc.
> >> >
> >> > So, the point is that the APIs we design won't leave drivers with a reason to
> >> > make this conversion in the first place. For the cases where they have to
> >> > (which should be rare), it's the right thing to do. There is not an alternative
> >> > API to point to.
> >> 
> >> Yes, but for such a case, I wouldn't want to use `TryFrom`, since that
> >> trait to me is a sign of a canonical way to convert a value.
> >
> > Well, it is the canonical way to convert, it's just that by the design of other
> > abstractions drivers should very rarely get in the situation of needing it in
> > the first place.
> 
> I'd still prefer it though, since one can spot a
> 
>     let dev = CustomDevice::checked_from(dev)?
> 
> much better in review than the `try_from` conversion. It also prevents
> one from giving it to a generic interface expecting the `TryFrom` trait.

(I plan to rebase this on my series introducing the Bound device context [1].)

I thought about this for a while and I still think TryFrom is fine here.

At some point I want to replace this implementation with a macro, since the code
is pretty similar for bus specific devices. I think that's a bit cleaner with
TryFrom compared to with a custom method, since we'd need the bus specific
device to call the macro from the generic impl, i.e.

	impl<Ctx: DeviceContext> Device<Ctx>

rather than a specific one, which we can't control. We can control it for
TryFrom though.

However, I also do not really object to your proposal, hence I'm willing to make
the change.

Do you want to make a proposal for the corresponding doc comment switching to a
custom method?

[1] https://lore.kernel.org/lkml/20250331202805.338468-1-dakr@kernel.org/
Benno Lossin April 2, 2025, 12:05 a.m. UTC | #12
On Tue Apr 1, 2025 at 3:51 PM CEST, Danilo Krummrich wrote:
> On Mon, Mar 24, 2025 at 06:32:53PM +0000, Benno Lossin wrote:
>> On Mon Mar 24, 2025 at 7:13 PM CET, Danilo Krummrich wrote:
>> > On Mon, Mar 24, 2025 at 05:36:45PM +0000, Benno Lossin wrote:
>> >> On Mon Mar 24, 2025 at 5:49 PM CET, Danilo Krummrich wrote:
>> >> > On Mon, Mar 24, 2025 at 04:39:25PM +0000, Benno Lossin wrote:
>> >> >> On Sun Mar 23, 2025 at 11:10 PM CET, Danilo Krummrich wrote:
>> >> >> > On Sat, Mar 22, 2025 at 11:10:57AM +0100, Danilo Krummrich wrote:
>> >> >> >> On Fri, Mar 21, 2025 at 08:25:07PM -0700, Greg KH wrote:
>> >> >> >> > Along these lines, if you can convince me that this is something that we
>> >> >> >> > really should be doing, in that we should always be checking every time
>> >> >> >> > someone would want to call to_pci_dev(), that the return value is
>> >> >> >> > checked, then why don't we also do this in C if it's going to be
>> >> >> >> > something to assure people it is going to be correct?  I don't want to
>> >> >> >> > see the rust and C sides get "out of sync" here for things that can be
>> >> >> >> > kept in sync, as that reduces the mental load of all of us as we travers
>> >> >> >> > across the boundry for the next 20+ years.
>> >> >> >> 
>> >> >> >> I think in this case it is good when the C and Rust side get a bit
>> >> >> >> "out of sync":
>> >> >> >
>> >> >> > A bit more clarification on this:
>> >> >> >
>> >> >> > What I want to say with this is, since we can cover a lot of the common cases
>> >> >> > through abstractions and the type system, we're left with the not so common
>> >> >> > ones, where the "upcasts" are not made in the context of common and well
>> >> >> > established patterns, but, for instance, depend on the semantics of the driver;
>> >> >> > those should not be unsafe IMHO.
>> >> >> 
>> >> >> I don't think that we should use `TryFrom` for stuff that should only be
>> >> >> used seldomly. A function that we can document properly is a much better
>> >> >> fit, since we can point users to the "correct" API.
>> >> >
>> >> > Most of the cases where drivers would do this conversion should be covered by
>> >> > the abstraction to already provide that actual bus specific device, rather than
>> >> > a generic one or some priv pointer, etc.
>> >> >
>> >> > So, the point is that the APIs we design won't leave drivers with a reason to
>> >> > make this conversion in the first place. For the cases where they have to
>> >> > (which should be rare), it's the right thing to do. There is not an alternative
>> >> > API to point to.
>> >> 
>> >> Yes, but for such a case, I wouldn't want to use `TryFrom`, since that
>> >> trait to me is a sign of a canonical way to convert a value.
>> >
>> > Well, it is the canonical way to convert, it's just that by the design of other
>> > abstractions drivers should very rarely get in the situation of needing it in
>> > the first place.
>> 
>> I'd still prefer it though, since one can spot a
>> 
>>     let dev = CustomDevice::checked_from(dev)?
>> 
>> much better in review than the `try_from` conversion. It also prevents
>> one from giving it to a generic interface expecting the `TryFrom` trait.
>
> (I plan to rebase this on my series introducing the Bound device context [1].)
>
> I thought about this for a while and I still think TryFrom is fine here.

What reasoning do you have?

> At some point I want to replace this implementation with a macro, since the code
> is pretty similar for bus specific devices. I think that's a bit cleaner with
> TryFrom compared to with a custom method, since we'd need the bus specific
> device to call the macro from the generic impl, i.e.
>
> 	impl<Ctx: DeviceContext> Device<Ctx>
>
> rather than a specific one, which we can't control. We can control it for
> TryFrom though.

We could have our own trait for that. Also it's not as controllable as
you think: anyone can implement `TryFrom<&device::Device> for &MyType`.

> However, I also do not really object to your proposal, hence I'm willing to make
> the change.
>
> Do you want to make a proposal for the corresponding doc comment switching to a
> custom method?

I think have too little context what `device::Device` and `pci::Device`
are. But I can give it a try:

    /// Tries to converts a generic [`Device`](device::Device) into a [`pci::Device`].
    ///
    /// Normally, one wouldn't need to call this function, because APIs should directly expose the
    /// concrete device type.

Then I think another sentence about a valid use-case of this function
would make a lot of sense, but I don't know any.

---
Cheers,
Benno
Danilo Krummrich April 2, 2025, 9:06 a.m. UTC | #13
On Wed, Apr 02, 2025 at 12:05:56AM +0000, Benno Lossin wrote:
> On Tue Apr 1, 2025 at 3:51 PM CEST, Danilo Krummrich wrote:
> > On Mon, Mar 24, 2025 at 06:32:53PM +0000, Benno Lossin wrote:
> >> On Mon Mar 24, 2025 at 7:13 PM CET, Danilo Krummrich wrote:
> >> > On Mon, Mar 24, 2025 at 05:36:45PM +0000, Benno Lossin wrote:
> >> >> On Mon Mar 24, 2025 at 5:49 PM CET, Danilo Krummrich wrote:
> >> >> > On Mon, Mar 24, 2025 at 04:39:25PM +0000, Benno Lossin wrote:
> >> >> >> On Sun Mar 23, 2025 at 11:10 PM CET, Danilo Krummrich wrote:
> >> >> >> > On Sat, Mar 22, 2025 at 11:10:57AM +0100, Danilo Krummrich wrote:
> >> >> >> >> On Fri, Mar 21, 2025 at 08:25:07PM -0700, Greg KH wrote:
> >> >> >> >> > Along these lines, if you can convince me that this is something that we
> >> >> >> >> > really should be doing, in that we should always be checking every time
> >> >> >> >> > someone would want to call to_pci_dev(), that the return value is
> >> >> >> >> > checked, then why don't we also do this in C if it's going to be
> >> >> >> >> > something to assure people it is going to be correct?  I don't want to
> >> >> >> >> > see the rust and C sides get "out of sync" here for things that can be
> >> >> >> >> > kept in sync, as that reduces the mental load of all of us as we travers
> >> >> >> >> > across the boundry for the next 20+ years.
> >> >> >> >> 
> >> >> >> >> I think in this case it is good when the C and Rust side get a bit
> >> >> >> >> "out of sync":
> >> >> >> >
> >> >> >> > A bit more clarification on this:
> >> >> >> >
> >> >> >> > What I want to say with this is, since we can cover a lot of the common cases
> >> >> >> > through abstractions and the type system, we're left with the not so common
> >> >> >> > ones, where the "upcasts" are not made in the context of common and well
> >> >> >> > established patterns, but, for instance, depend on the semantics of the driver;
> >> >> >> > those should not be unsafe IMHO.
> >> >> >> 
> >> >> >> I don't think that we should use `TryFrom` for stuff that should only be
> >> >> >> used seldomly. A function that we can document properly is a much better
> >> >> >> fit, since we can point users to the "correct" API.
> >> >> >
> >> >> > Most of the cases where drivers would do this conversion should be covered by
> >> >> > the abstraction to already provide that actual bus specific device, rather than
> >> >> > a generic one or some priv pointer, etc.
> >> >> >
> >> >> > So, the point is that the APIs we design won't leave drivers with a reason to
> >> >> > make this conversion in the first place. For the cases where they have to
> >> >> > (which should be rare), it's the right thing to do. There is not an alternative
> >> >> > API to point to.
> >> >> 
> >> >> Yes, but for such a case, I wouldn't want to use `TryFrom`, since that
> >> >> trait to me is a sign of a canonical way to convert a value.
> >> >
> >> > Well, it is the canonical way to convert, it's just that by the design of other
> >> > abstractions drivers should very rarely get in the situation of needing it in
> >> > the first place.
> >> 
> >> I'd still prefer it though, since one can spot a
> >> 
> >>     let dev = CustomDevice::checked_from(dev)?
> >> 
> >> much better in review than the `try_from` conversion. It also prevents
> >> one from giving it to a generic interface expecting the `TryFrom` trait.
> >
> > (I plan to rebase this on my series introducing the Bound device context [1].)
> >
> > I thought about this for a while and I still think TryFrom is fine here.
> 
> What reasoning do you have?

The concern in terms of abuse is that one could try to randomly guess the
"outer" device type (if any), which obiously indicates a fundamental design
issue.

But that's not specific to devices; it is a common anti-pattern in OOP to
randomly guess the subclass type of an object instance.

So, I don't think the situation here is really that special such that it needs
an extra highlight.

> > At some point I want to replace this implementation with a macro, since the code
> > is pretty similar for bus specific devices. I think that's a bit cleaner with
> > TryFrom compared to with a custom method, since we'd need the bus specific
> > device to call the macro from the generic impl, i.e.
> >
> > 	impl<Ctx: DeviceContext> Device<Ctx>
> >
> > rather than a specific one, which we can't control. We can control it for
> > TryFrom though.
> 
> We could have our own trait for that.

I don't think we should have a trait specific for devices for this. If we really
think the above anti-pattern deserves special attention, then we should have a
generic trait (e.g. FromSuper<T>) instead.

But I'm not sure that we really need to put special attention on that.
diff mbox series

Patch

diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs
index 67a2fc46cf4c..0bbc67edd38d 100644
--- a/rust/kernel/device.rs
+++ b/rust/kernel/device.rs
@@ -66,7 +66,6 @@  pub(crate) fn as_raw(&self) -> *mut bindings::device {
     }
 
     /// Returns a raw pointer to the device' bus type.
-    #[expect(unused)]
     pub(crate) fn bus_type_raw(&self) -> *const bindings::bus_type {
         // SAFETY:
         // - By the type invariants, `self.as_raw()` is a valid pointer to a `struct device`.
diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs
index 22a32172b108..8b7fcfe8bffc 100644
--- a/rust/kernel/pci.rs
+++ b/rust/kernel/pci.rs
@@ -6,7 +6,7 @@ 
 
 use crate::{
     alloc::flags::*,
-    bindings, device,
+    bindings, container_of, device,
     device_id::RawDeviceId,
     devres::Devres,
     driver,
@@ -20,7 +20,7 @@ 
 use core::{
     marker::PhantomData,
     ops::Deref,
-    ptr::{addr_of_mut, NonNull},
+    ptr::{addr_of, addr_of_mut, NonNull},
 };
 use kernel::prelude::*;
 
@@ -466,6 +466,27 @@  fn as_ref(&self) -> &device::Device {
     }
 }
 
+impl TryFrom<&device::Device> for &Device {
+    type Error = kernel::error::Error;
+
+    fn try_from(dev: &device::Device) -> Result<Self, Self::Error> {
+        #[allow(unused_unsafe)]
+        // SAFETY: rustc < 1.82 falsely treats this as unsafe, see:
+        // https://blog.rust-lang.org/2024/10/17/Rust-1.82.0.html#safely-addressing-unsafe-statics
+        if dev.bus_type_raw() != unsafe { addr_of!(bindings::pci_bus_type) } {
+            return Err(EINVAL);
+        }
+
+        // SAFETY: We've just verified that the bus type of `dev` equals `bindings::pci_bus_type`,
+        // hence `dev` must be embedded in a valid `struct pci_dev` as guaranteed by the
+        // corresponding C code.
+        let pdev = unsafe { container_of!(dev.as_raw(), bindings::pci_dev, dev) };
+
+        // SAFETY: `pdev` is a valid pointer to a `struct pci_dev`.
+        Ok(unsafe { &*pdev.cast() })
+    }
+}
+
 // SAFETY: A `Device` is always reference-counted and can be released from any thread.
 unsafe impl Send for Device {}