diff mbox series

rust: str: Use `core::CStr`, remove the custom `CStr` implementation

Message ID 20240714160238.238708-1-vadorovsky@gmail.com (mailing list archive)
State New
Headers show
Series rust: str: Use `core::CStr`, remove the custom `CStr` implementation | expand

Commit Message

Michal Rostecki July 14, 2024, 4:02 p.m. UTC
`CStr` became a part of `core` library in Rust 1.75, therefore there is
no need to keep the custom implementation.

`core::CStr` behaves generally the same as the removed implementation,
with the following differences:

- It does not implement `Display` (but implements `Debug`).
- It does not provide `from_bytes_with_nul_unchecked_mut` method.
  - It was used only in `DerefMut` implementation for `CString`. This
    change replaces it with a manual cast to `&mut CStr`.
  - Otherwise, having such a method is not really desirable. `CStr` is
    a reference type
    or `str` are usually not supposed to be modified.
- It has `as_ptr()` method instead of `as_char_ptr()`, which also returns
  `*const c_char`.

Rust also introduces C literals (`c""`), so the `c_str` macro is removed
here as well.

Signed-off-by: Michal Rostecki <vadorovsky@gmail.com>
---
 rust/kernel/error.rs        |   7 +-
 rust/kernel/init.rs         |   8 +-
 rust/kernel/kunit.rs        |  16 +-
 rust/kernel/net/phy.rs      |   2 +-
 rust/kernel/prelude.rs      |   4 +-
 rust/kernel/str.rs          | 490 +-----------------------------------
 rust/kernel/sync.rs         |  13 +-
 rust/kernel/sync/condvar.rs |   5 +-
 rust/kernel/sync/lock.rs    |   6 +-
 rust/kernel/workqueue.rs    |  10 +-
 scripts/rustdoc_test_gen.rs |   4 +-
 11 files changed, 57 insertions(+), 508 deletions(-)

Comments

Björn Roy Baron July 14, 2024, 5:01 p.m. UTC | #1
On Sunday, July 14th, 2024 at 18:02, Michal Rostecki <vadorovsky@gmail.com> wrote:

> `CStr` became a part of `core` library in Rust 1.75, therefore there is
> no need to keep the custom implementation.
> 
> `core::CStr` behaves generally the same as the removed implementation,
> with the following differences:
> 
> - It does not implement `Display` (but implements `Debug`).
> - It does not provide `from_bytes_with_nul_unchecked_mut` method.
>   - It was used only in `DerefMut` implementation for `CString`. This
>     change replaces it with a manual cast to `&mut CStr`.
>   - Otherwise, having such a method is not really desirable. `CStr` is
>     a reference type
>     or `str` are usually not supposed to be modified.
> - It has `as_ptr()` method instead of `as_char_ptr()`, which also returns
>   `*const c_char`.
> 
> Rust also introduces C literals (`c""`), so the `c_str` macro is removed
> here as well.
> 
> Signed-off-by: Michal Rostecki <vadorovsky@gmail.com>
> ---
>  rust/kernel/error.rs        |   7 +-
>  rust/kernel/init.rs         |   8 +-
>  rust/kernel/kunit.rs        |  16 +-
>  rust/kernel/net/phy.rs      |   2 +-
>  rust/kernel/prelude.rs      |   4 +-
>  rust/kernel/str.rs          | 490 +-----------------------------------
>  rust/kernel/sync.rs         |  13 +-
>  rust/kernel/sync/condvar.rs |   5 +-
>  rust/kernel/sync/lock.rs    |   6 +-
>  rust/kernel/workqueue.rs    |  10 +-
>  scripts/rustdoc_test_gen.rs |   4 +-
>  11 files changed, 57 insertions(+), 508 deletions(-)
> 

[snip]

> diff --git a/rust/kernel/init.rs b/rust/kernel/init.rs
> index 68605b633e73..af0017e56c0e 100644
> --- a/rust/kernel/init.rs
> +++ b/rust/kernel/init.rs
> @@ -46,7 +46,7 @@
>  //! }
>  //!
>  //! let foo = pin_init!(Foo {
> -//!     a <- new_mutex!(42, "Foo::a"),
> +//!     a <- new_mutex!(42, c"Foo::a"),

That we need a CStr here seems a bit of an internal implementation detail. Maybe
keep accepting a regular string literal and converting it to a CStr internally?
If others think what you have here is fine, I don't it mind all that much though.

>  //!     b: 24,
>  //! });
>  //! ```

[snip]

> @@ -840,9 +375,10 @@ fn deref(&self) -> &Self::Target {
> 
>  impl DerefMut for CString {
>      fn deref_mut(&mut self) -> &mut Self::Target {
> -        // SAFETY: A `CString` is always NUL-terminated and contains no other
> -        // NUL bytes.
> -        unsafe { CStr::from_bytes_with_nul_unchecked_mut(self.buf.as_mut_slice()) }
> +        debug_assert!(!self.buf.is_empty() && self.buf[self.buf.len() - 1] == 0);
> +        // SAFETY: Casting to CStr is safe because its internal representation
> +        // is a [u8] too.
> +        unsafe { &mut *(self.buf.as_mut_slice() as *mut [u8] as *mut CStr) }

The documentation of CStr [1] is very clear that the layout of CStr is not guaranteed.

> Note that this structure does not have a guaranteed layout (the repr(transparent)
> notwithstanding) and is not recommended to be placed in the signatures of FFI
> functions. Instead, safe wrappers of FFI functions may leverage the unsafe
> CStr::from_ptr constructor to provide a safe interface to other consumers.

Is there any place where this DerefMut impl is actually used? If not it should probably
be removed. The liballoc version of CString doesn't have this impl either. (Can we use
the liballoc version of CString too just like this patch does for CStr?)

[snip]

Link: https://doc.rust-lang.org/stable/std/ffi/struct.CStr.html [1]
Miguel Ojeda July 14, 2024, 5:30 p.m. UTC | #2
Hi Michal,

Thanks for the patch! Some notes below...

On Sun, Jul 14, 2024 at 6:02 PM Michal Rostecki <vadorovsky@gmail.com> wrote:
>
> `CStr` became a part of `core` library in Rust 1.75, therefore there is
> no need to keep the custom implementation.

It would depend on the differences, right? i.e. for a reader, is this
meant to imply there is no meaningful difference in what you point out
below?

> - It does not implement `Display` (but implements `Debug`).

One question that comes up when reading this is: are we losing
`Display`'s output form?

Also, for clarity, please mention if there is a difference in the
output of the `Debug` ones.

>   - Otherwise, having such a method is not really desirable. `CStr` is
>     a reference type
>     or `str` are usually not supposed to be modified.

The sentence seems to be cut, and it should probably try to explain
better why it is undesirable, i.e. if it is needed for something like
`DerefMut`, then it seems better to have a method.

> -            static CONDITION: &'static $crate::str::CStr = $crate::c_str!(stringify!($condition));
> +            static CONDITION: &'static core::ffi::CStr = unsafe {
> +                core::ffi::CStr::from_bytes_with_nul_unchecked(
> +                    core::concat!(stringify!($condition), "\0").as_bytes()
> +                )
> +            };

This looks worse after the change and requires `unsafe`. Can we do
something to improve it?

> +        // SAFETY: Casting to CStr is safe because its internal representation
> +        // is a [u8] too.
> +        unsafe { &mut *(self.buf.as_mut_slice() as *mut [u8] as *mut CStr) }

I see Björn commented on this already -- `CStr`'s layout is not
guaranteed (and is a `[c_char]` instead).

Also, the casting is not what is unsafe, so perhaps it may be clearer
to reword the comment.

In addition, please format comments as Markdown.

> -//!             work <- new_work!("MyStruct::work"),
> +//!             work <- new_work!(c"MyStruct::work"),

I agree as well that it may make sense to simplify the callers as much
as possible, unless there is a need to have that flexibility.

Cheers,
Miguel
Michal Rostecki July 15, 2024, 3:46 p.m. UTC | #3
On 14.07.24 19:01, Björn Roy Baron wrote:
> On Sunday, July 14th, 2024 at 18:02, Michal Rostecki <vadorovsky@gmail.com> wrote:
> 
>> `CStr` became a part of `core` library in Rust 1.75, therefore there is
>> no need to keep the custom implementation.
>>
>> `core::CStr` behaves generally the same as the removed implementation,
>> with the following differences:
>>
>> - It does not implement `Display` (but implements `Debug`).
>> - It does not provide `from_bytes_with_nul_unchecked_mut` method.
>>    - It was used only in `DerefMut` implementation for `CString`. This
>>      change replaces it with a manual cast to `&mut CStr`.
>>    - Otherwise, having such a method is not really desirable. `CStr` is
>>      a reference type
>>      or `str` are usually not supposed to be modified.
>> - It has `as_ptr()` method instead of `as_char_ptr()`, which also returns
>>    `*const c_char`.
>>
>> Rust also introduces C literals (`c""`), so the `c_str` macro is removed
>> here as well.
>>
>> Signed-off-by: Michal Rostecki <vadorovsky@gmail.com>
>> ---
>>   rust/kernel/error.rs        |   7 +-
>>   rust/kernel/init.rs         |   8 +-
>>   rust/kernel/kunit.rs        |  16 +-
>>   rust/kernel/net/phy.rs      |   2 +-
>>   rust/kernel/prelude.rs      |   4 +-
>>   rust/kernel/str.rs          | 490 +-----------------------------------
>>   rust/kernel/sync.rs         |  13 +-
>>   rust/kernel/sync/condvar.rs |   5 +-
>>   rust/kernel/sync/lock.rs    |   6 +-
>>   rust/kernel/workqueue.rs    |  10 +-
>>   scripts/rustdoc_test_gen.rs |   4 +-
>>   11 files changed, 57 insertions(+), 508 deletions(-)
>>
> 
> [snip]
> 
>> diff --git a/rust/kernel/init.rs b/rust/kernel/init.rs
>> index 68605b633e73..af0017e56c0e 100644
>> --- a/rust/kernel/init.rs
>> +++ b/rust/kernel/init.rs
>> @@ -46,7 +46,7 @@
>>   //! }
>>   //!
>>   //! let foo = pin_init!(Foo {
>> -//!     a <- new_mutex!(42, "Foo::a"),
>> +//!     a <- new_mutex!(42, c"Foo::a"),
> 
> That we need a CStr here seems a bit of an internal implementation detail. Maybe
> keep accepting a regular string literal and converting it to a CStr internally?
> If others think what you have here is fine, I don't it mind all that much though.
> 

The names passed to `new_mutex`, `new_condvar`, `new_spinlock` etc. are 
immediately passed in the FFI calls (`__mutex_init`, 
`__init_waitqueue_head`, `__spin_lock_init`) [0][1][2]. In fact, I don't 
see any internal usage, where using Rust &str would be beneficial. Am I 
missing something?

Converting a &str to &CStr inside `Mutex::new` or `CondVar::new` would 
require allocating a new buffer, larger by 1, to include the nul byte. 
Doing that for every new mutex or condvar seems a bit wasteful to me.

[0] 
https://github.com/Rust-for-Linux/linux/blob/b1263411112305acf2af728728591465becb45b0/rust/kernel/sync/lock/mutex.rs#L104
[1] 
https://github.com/Rust-for-Linux/linux/blob/b1263411112305acf2af728728591465becb45b0/rust/kernel/sync/condvar.rs#L111
[2] 
https://github.com/Rust-for-Linux/linux/blob/b1263411112305acf2af728728591465becb45b0/rust/kernel/sync/lock/spinlock.rs#L103

>>   //!     b: 24,
>>   //! });
>>   //! ```
> 
> [snip]
> 
>> @@ -840,9 +375,10 @@ fn deref(&self) -> &Self::Target {
>>
>>   impl DerefMut for CString {
>>       fn deref_mut(&mut self) -> &mut Self::Target {
>> -        // SAFETY: A `CString` is always NUL-terminated and contains no other
>> -        // NUL bytes.
>> -        unsafe { CStr::from_bytes_with_nul_unchecked_mut(self.buf.as_mut_slice()) }
>> +        debug_assert!(!self.buf.is_empty() && self.buf[self.buf.len() - 1] == 0);
>> +        // SAFETY: Casting to CStr is safe because its internal representation
>> +        // is a [u8] too.
>> +        unsafe { &mut *(self.buf.as_mut_slice() as *mut [u8] as *mut CStr) }
> 
> The documentation of CStr [1] is very clear that the layout of CStr is not guaranteed.
> 
>> Note that this structure does not have a guaranteed layout (the repr(transparent)
>> notwithstanding) and is not recommended to be placed in the signatures of FFI
>> functions. Instead, safe wrappers of FFI functions may leverage the unsafe
>> CStr::from_ptr constructor to provide a safe interface to other consumers.
> 
> Is there any place where this DerefMut impl is actually used? If not it should probably
> be removed. The liballoc version of CString doesn't have this impl either. (Can we use
> the liballoc version of CString too just like this patch does for CStr?)
> 
> [snip]
> 
> Link: https://doc.rust-lang.org/stable/std/ffi/struct.CStr.html [1]

Good call. The `DerefMut` was not used anywhere, removing it works.
Björn Roy Baron July 15, 2024, 3:56 p.m. UTC | #4
On Monday, July 15th, 2024 at 17:46, Michal Rostecki <vadorovsky@gmail.com> wrote:

> On 14.07.24 19:01, Björn Roy Baron wrote:
> > On Sunday, July 14th, 2024 at 18:02, Michal Rostecki <vadorovsky@gmail.com> wrote:
> >
> >> `CStr` became a part of `core` library in Rust 1.75, therefore there is
> >> no need to keep the custom implementation.
> >>
> >> `core::CStr` behaves generally the same as the removed implementation,
> >> with the following differences:
> >>
> >> - It does not implement `Display` (but implements `Debug`).
> >> - It does not provide `from_bytes_with_nul_unchecked_mut` method.
> >>    - It was used only in `DerefMut` implementation for `CString`. This
> >>      change replaces it with a manual cast to `&mut CStr`.
> >>    - Otherwise, having such a method is not really desirable. `CStr` is
> >>      a reference type
> >>      or `str` are usually not supposed to be modified.
> >> - It has `as_ptr()` method instead of `as_char_ptr()`, which also returns
> >>    `*const c_char`.
> >>
> >> Rust also introduces C literals (`c""`), so the `c_str` macro is removed
> >> here as well.
> >>
> >> Signed-off-by: Michal Rostecki <vadorovsky@gmail.com>
> >> ---
> >>   rust/kernel/error.rs        |   7 +-
> >>   rust/kernel/init.rs         |   8 +-
> >>   rust/kernel/kunit.rs        |  16 +-
> >>   rust/kernel/net/phy.rs      |   2 +-
> >>   rust/kernel/prelude.rs      |   4 +-
> >>   rust/kernel/str.rs          | 490 +-----------------------------------
> >>   rust/kernel/sync.rs         |  13 +-
> >>   rust/kernel/sync/condvar.rs |   5 +-
> >>   rust/kernel/sync/lock.rs    |   6 +-
> >>   rust/kernel/workqueue.rs    |  10 +-
> >>   scripts/rustdoc_test_gen.rs |   4 +-
> >>   11 files changed, 57 insertions(+), 508 deletions(-)
> >>
> >
> > [snip]
> >
> >> diff --git a/rust/kernel/init.rs b/rust/kernel/init.rs
> >> index 68605b633e73..af0017e56c0e 100644
> >> --- a/rust/kernel/init.rs
> >> +++ b/rust/kernel/init.rs
> >> @@ -46,7 +46,7 @@
> >>   //! }
> >>   //!
> >>   //! let foo = pin_init!(Foo {
> >> -//!     a <- new_mutex!(42, "Foo::a"),
> >> +//!     a <- new_mutex!(42, c"Foo::a"),
> >
> > That we need a CStr here seems a bit of an internal implementation detail. Maybe
> > keep accepting a regular string literal and converting it to a CStr internally?
> > If others think what you have here is fine, I don't it mind all that much though.
> >
> 
> The names passed to `new_mutex`, `new_condvar`, `new_spinlock` etc. are
> immediately passed in the FFI calls (`__mutex_init`,
> `__init_waitqueue_head`, `__spin_lock_init`) [0][1][2]. In fact, I don't
> see any internal usage, where using Rust &str would be beneficial. Am I
> missing something?
> 
> Converting a &str to &CStr inside `Mutex::new` or `CondVar::new` would
> require allocating a new buffer, larger by 1, to include the nul byte.
> Doing that for every new mutex or condvar seems a bit wasteful to me.

The names passed to `new_mutex!` and such are literals known at
compile time. This means we can keep adding the nul terminator at
compile time without allocating any extra buffer. Basically just
adapting the current implementation of `optional_name!` to produce an
`core::ffi::&CStr` rather than a `kernel::str::CStr` from a regular
string literal is enough to avoid having to explicitly use C string
literals in those macro invocations. This way users don't need to
know that internally an `&CStr` is used.

> 
> [0]
> https://github.com/Rust-for-Linux/linux/blob/b1263411112305acf2af728728591465becb45b0/rust/kernel/sync/lock/mutex.rs#L104
> [1]
> https://github.com/Rust-for-Linux/linux/blob/b1263411112305acf2af728728591465becb45b0/rust/kernel/sync/condvar.rs#L111
> [2]
> https://github.com/Rust-for-Linux/linux/blob/b1263411112305acf2af728728591465becb45b0/rust/kernel/sync/lock/spinlock.rs#L103

[snip]
Michal Rostecki July 15, 2024, 4:12 p.m. UTC | #5
On 14.07.24 19:30, Miguel Ojeda wrote:
> Hi Michal,
> 
> Thanks for the patch! Some notes below...
> 
> On Sun, Jul 14, 2024 at 6:02 PM Michal Rostecki <vadorovsky@gmail.com> wrote:
>>
>> `CStr` became a part of `core` library in Rust 1.75, therefore there is
>> no need to keep the custom implementation.
> 
> It would depend on the differences, right? i.e. for a reader, is this
> meant to imply there is no meaningful difference in what you point out
> below?
> 

Alright, I will remove the second part of the sentence.

>> - It does not implement `Display` (but implements `Debug`).
> 
> One question that comes up when reading this is: are we losing
> `Display`'s output form?
> 

Yes, we are losing the `Display` trait implementation by switching to 
`core::ffi::CStr`.

I was thinking whether I should keep `kernel::str::CStr` as a wrapper, 
just to keep the `Display` implementation. I could still do that if you 
want. I'm also open for other solutions.

The reason why I decided to not do that and go ahead without `Display` 
is that it was used only in rust/kernel/kunit.rs inside `kunit_assert`, 
for formatting the file and path the error message. This diff:

@@ -71,11 +75,11 @@ macro_rules! kunit_assert {
                  //
                  // This mimics KUnit's failed assertion format.
                  $crate::kunit::err(format_args!(
-                    "    # {}: ASSERTION FAILED at {FILE}:{LINE}\n",
+                    "    # {:?}: ASSERTION FAILED at {FILE:?}:{LINE:?}\n",
                      $name
                  ));
                  $crate::kunit::err(format_args!(
-                    "    Expected {CONDITION} to be true, but is false\n"
+                    "    Expected {CONDITION:?} to be true, but is false\n"
                  ));

The only practical difference in switching from `Display` to `Debug` 
here is that the fallback kunit error messages are going to include 
quotation marks around conditions, files and lines.

> Also, for clarity, please mention if there is a difference in the
> output of the `Debug` ones.
> 

There isn't any difference, I will mention that.

>>    - Otherwise, having such a method is not really desirable. `CStr` is
>>      a reference type
>>      or `str` are usually not supposed to be modified.
> 
> The sentence seems to be cut, and it should probably try to explain
> better why it is undesirable, i.e. if it is needed for something like
> `DerefMut`, then it seems better to have a method.
> 

Regarding `DerefMut` implementation for `CString` - we don't need it. Or 
at least - removing it (after my CStr patch), does not break anything. 
If that's fine for you, I'm going to remove it in v2 all together.

About why having `&mut CStr` is undesirable - I will try to find better 
wording. My general point is that I've never seen `&mut str` being 
exposed in any core/std API to the external user, mutation usually 
implies usage of an owned String.

>> -            static CONDITION: &'static $crate::str::CStr = $crate::c_str!(stringify!($condition));
>> +            static CONDITION: &'static core::ffi::CStr = unsafe {
>> +                core::ffi::CStr::from_bytes_with_nul_unchecked(
>> +                    core::concat!(stringify!($condition), "\0").as_bytes()
>> +                )
>> +            };
> 
> This looks worse after the change and requires `unsafe`. Can we do
> something to improve it?
> 

I think the best solution would be leaving `c_str` macro for that. The 
reason why I removed it is that the GitHub issue[0] mentions its 
removal. But for that case, I think it makes sense to leave it. What do 
you think?

[0] https://github.com/Rust-for-Linux/linux/issues/1075

>> +        // SAFETY: Casting to CStr is safe because its internal representation
>> +        // is a [u8] too.
>> +        unsafe { &mut *(self.buf.as_mut_slice() as *mut [u8] as *mut CStr) }
> 
> I see Björn commented on this already -- `CStr`'s layout is not
> guaranteed (and is a `[c_char]` instead).
> 
> Also, the casting is not what is unsafe, so perhaps it may be clearer
> to reword the comment.
> 
> In addition, please format comments as Markdown.
> 

Good point, I will fix the comment.

>> -//!             work <- new_work!("MyStruct::work"),
>> +//!             work <- new_work!(c"MyStruct::work"),
> 
> I agree as well that it may make sense to simplify the callers as much
> as possible, unless there is a need to have that flexibility.
> 

I already replied to Björn - names passed to `new_work!`, `new_mutex!` 
are immediatelly passed to FFI calls and are not used in the Rust code 
internally, so I prefer to keep them as C strings rather than Rust strings.

> Cheers,
> Miguel


Cheers,
Michal
Michal Rostecki July 15, 2024, 4:15 p.m. UTC | #6
On 15.07.24 17:56, Björn Roy Baron wrote:
> On Monday, July 15th, 2024 at 17:46, Michal Rostecki <vadorovsky@gmail.com> wrote:
> 
>> On 14.07.24 19:01, Björn Roy Baron wrote:
>>> On Sunday, July 14th, 2024 at 18:02, Michal Rostecki <vadorovsky@gmail.com> wrote:
>>>
>>>> `CStr` became a part of `core` library in Rust 1.75, therefore there is
>>>> no need to keep the custom implementation.
>>>>
>>>> `core::CStr` behaves generally the same as the removed implementation,
>>>> with the following differences:
>>>>
>>>> - It does not implement `Display` (but implements `Debug`).
>>>> - It does not provide `from_bytes_with_nul_unchecked_mut` method.
>>>>     - It was used only in `DerefMut` implementation for `CString`. This
>>>>       change replaces it with a manual cast to `&mut CStr`.
>>>>     - Otherwise, having such a method is not really desirable. `CStr` is
>>>>       a reference type
>>>>       or `str` are usually not supposed to be modified.
>>>> - It has `as_ptr()` method instead of `as_char_ptr()`, which also returns
>>>>     `*const c_char`.
>>>>
>>>> Rust also introduces C literals (`c""`), so the `c_str` macro is removed
>>>> here as well.
>>>>
>>>> Signed-off-by: Michal Rostecki <vadorovsky@gmail.com>
>>>> ---
>>>>    rust/kernel/error.rs        |   7 +-
>>>>    rust/kernel/init.rs         |   8 +-
>>>>    rust/kernel/kunit.rs        |  16 +-
>>>>    rust/kernel/net/phy.rs      |   2 +-
>>>>    rust/kernel/prelude.rs      |   4 +-
>>>>    rust/kernel/str.rs          | 490 +-----------------------------------
>>>>    rust/kernel/sync.rs         |  13 +-
>>>>    rust/kernel/sync/condvar.rs |   5 +-
>>>>    rust/kernel/sync/lock.rs    |   6 +-
>>>>    rust/kernel/workqueue.rs    |  10 +-
>>>>    scripts/rustdoc_test_gen.rs |   4 +-
>>>>    11 files changed, 57 insertions(+), 508 deletions(-)
>>>>
>>>
>>> [snip]
>>>
>>>> diff --git a/rust/kernel/init.rs b/rust/kernel/init.rs
>>>> index 68605b633e73..af0017e56c0e 100644
>>>> --- a/rust/kernel/init.rs
>>>> +++ b/rust/kernel/init.rs
>>>> @@ -46,7 +46,7 @@
>>>>    //! }
>>>>    //!
>>>>    //! let foo = pin_init!(Foo {
>>>> -//!     a <- new_mutex!(42, "Foo::a"),
>>>> +//!     a <- new_mutex!(42, c"Foo::a"),
>>>
>>> That we need a CStr here seems a bit of an internal implementation detail. Maybe
>>> keep accepting a regular string literal and converting it to a CStr internally?
>>> If others think what you have here is fine, I don't it mind all that much though.
>>>
>>
>> The names passed to `new_mutex`, `new_condvar`, `new_spinlock` etc. are
>> immediately passed in the FFI calls (`__mutex_init`,
>> `__init_waitqueue_head`, `__spin_lock_init`) [0][1][2]. In fact, I don't
>> see any internal usage, where using Rust &str would be beneficial. Am I
>> missing something?
>>
>> Converting a &str to &CStr inside `Mutex::new` or `CondVar::new` would
>> require allocating a new buffer, larger by 1, to include the nul byte.
>> Doing that for every new mutex or condvar seems a bit wasteful to me.
> 
> The names passed to `new_mutex!` and such are literals known at
> compile time. This means we can keep adding the nul terminator at
> compile time without allocating any extra buffer. Basically just
> adapting the current implementation of `optional_name!` to produce an
> `core::ffi::&CStr` rather than a `kernel::str::CStr` from a regular
> string literal is enough to avoid having to explicitly use C string
> literals in those macro invocations. This way users don't need to
> know that internally an `&CStr` is used.
> 

OK, good point, I can indeed handle that in `optional_name!`.

>>
>> [0]
>> https://github.com/Rust-for-Linux/linux/blob/b1263411112305acf2af728728591465becb45b0/rust/kernel/sync/lock/mutex.rs#L104
>> [1]
>> https://github.com/Rust-for-Linux/linux/blob/b1263411112305acf2af728728591465becb45b0/rust/kernel/sync/condvar.rs#L111
>> [2]
>> https://github.com/Rust-for-Linux/linux/blob/b1263411112305acf2af728728591465becb45b0/rust/kernel/sync/lock/spinlock.rs#L103
> 
> [snip]
Miguel Ojeda July 16, 2024, 7:44 a.m. UTC | #7
On Mon, Jul 15, 2024 at 6:12 PM Michal Rostecki <vadorovsky@gmail.com> wrote:
>
> @@ -71,11 +75,11 @@ macro_rules! kunit_assert {
>                   //
>                   // This mimics KUnit's failed assertion format.
>                   $crate::kunit::err(format_args!(
> -                    "    # {}: ASSERTION FAILED at {FILE}:{LINE}\n",
> +                    "    # {:?}: ASSERTION FAILED at {FILE:?}:{LINE:?}\n",
>                       $name
>                   ));
>                   $crate::kunit::err(format_args!(
> -                    "    Expected {CONDITION} to be true, but is false\n"
> +                    "    Expected {CONDITION:?} to be true, but is false\n"
>                   ));
>
> The only practical difference in switching from `Display` to `Debug`
> here is that the fallback kunit error messages are going to include
> quotation marks around conditions, files and lines.

That is a fairly important difference -- the messages are intended to
match the C KUnit ones.

Especially the file:line notation -- I don't think a user would expect
to have quotes there (regardless of KUnit).

In general, even if we didn't need it right now, I think it is
something we will need sooner or later.

> wording. My general point is that I've never seen `&mut str` being
> exposed in any core/std API to the external user, mutation usually
> implies usage of an owned String.

Not sure what you mean by exposed, but even if `&mut str`'s methods do
not count (used via `String`), there is also
`from_utf8_unchecked_mut()` that returns one, which is essentially the
same idea as what we had here.

So I am not sure about the "The rule of Rust std" part in the new
commit messages. And, to be clear, while the Rust standard library is
a good reference to look into, sometimes we want/need to do things
differently anyway (which is not really the case here given
`from_utf8_unchecked_mut()`, I would say).

In this case, regardless of the standard library, personally I would
have preferred to have a non-public function, but still have it (and
properly documented), rather than open code the `unsafe` block with
the casts.

> I think the best solution would be leaving `c_str` macro for that. The
> reason why I removed it is that the GitHub issue[0] mentions its
> removal. But for that case, I think it makes sense to leave it. What do
> you think?

Perhaps the issue was only taking into account the C string literal
case? Benno may know more.

Generally speaking, replacing a clean line with a bigger `unsafe`
block is something to be avoided.

Maybe a `c_stringify!` is what we need :)

Cheers,
Miguel
Michal Rostecki July 17, 2024, 3:22 p.m. UTC | #8
On 16.07.24 09:44, Miguel Ojeda wrote:
> On Mon, Jul 15, 2024 at 6:12 PM Michal Rostecki <vadorovsky@gmail.com> wrote:
>>
>> @@ -71,11 +75,11 @@ macro_rules! kunit_assert {
>>                    //
>>                    // This mimics KUnit's failed assertion format.
>>                    $crate::kunit::err(format_args!(
>> -                    "    # {}: ASSERTION FAILED at {FILE}:{LINE}\n",
>> +                    "    # {:?}: ASSERTION FAILED at {FILE:?}:{LINE:?}\n",
>>                        $name
>>                    ));
>>                    $crate::kunit::err(format_args!(
>> -                    "    Expected {CONDITION} to be true, but is false\n"
>> +                    "    Expected {CONDITION:?} to be true, but is false\n"
>>                    ));
>>
>> The only practical difference in switching from `Display` to `Debug`
>> here is that the fallback kunit error messages are going to include
>> quotation marks around conditions, files and lines.
> 
> That is a fairly important difference -- the messages are intended to
> match the C KUnit ones.
> 
> Especially the file:line notation -- I don't think a user would expect
> to have quotes there (regardless of KUnit).
> 
> In general, even if we didn't need it right now, I think it is
> something we will need sooner or later.
> 

Alright, I will go with Trevor's suggestion and provide a `display()` 
method via an extension trait.

>> wording. My general point is that I've never seen `&mut str` being
>> exposed in any core/std API to the external user, mutation usually
>> implies usage of an owned String.
> 
> Not sure what you mean by exposed, but even if `&mut str`'s methods do
> not count (used via `String`), there is also
> `from_utf8_unchecked_mut()` that returns one, which is essentially the
> same idea as what we had here.
> 
> So I am not sure about the "The rule of Rust std" part in the new
> commit messages. And, to be clear, while the Rust standard library is
> a good reference to look into, sometimes we want/need to do things
> differently anyway (which is not really the case here given
> `from_utf8_unchecked_mut()`, I would say).
> 
> In this case, regardless of the standard library, personally I would
> have preferred to have a non-public function, but still have it (and
> properly documented), rather than open code the `unsafe` block with
> the casts.
> 

Fair enough. I will provide `from_utf8_unchecked_mut()` as a part of 
`CStrExt` in the next version.

>> I think the best solution would be leaving `c_str` macro for that. The
>> reason why I removed it is that the GitHub issue[0] mentions its
>> removal. But for that case, I think it makes sense to leave it. What do
>> you think?
> 
> Perhaps the issue was only taking into account the C string literal
> case? Benno may know more.
> 
> Generally speaking, replacing a clean line with a bigger `unsafe`
> block is something to be avoided.
> 
> Maybe a `c_stringify!` is what we need :)
> 

`stringify!` is not the only case where I ended up using `c_str!`. After 
addressing Björn's suggestion about taking Rust strings as arguments in 
`new_mutex!`, `new_condvar!` etc., `optional_name!` is also using 
`c_str!` in the following way:

   macro_rules! optional_name {
       () => {
           $crate::c_str!(::core::concat!(::core::file!(), ":", 
::core::line!()))
       };
       ($name:literal) => {
           $crate::c_str!($name)
       };
   }


So I think that leaving `c_str!` still makes sense, unless you have 
other suggestions, which are still easily applicable there. :)

> Cheers,
> Miguel

Cheers,
Michal
diff mbox series

Patch

diff --git a/rust/kernel/error.rs b/rust/kernel/error.rs
index 55280ae9fe40..18808b29604d 100644
--- a/rust/kernel/error.rs
+++ b/rust/kernel/error.rs
@@ -4,10 +4,11 @@ 
 //!
 //! C header: [`include/uapi/asm-generic/errno-base.h`](srctree/include/uapi/asm-generic/errno-base.h)
 
-use crate::{alloc::AllocError, str::CStr};
+use crate::alloc::AllocError;
 
 use alloc::alloc::LayoutError;
 
+use core::ffi::CStr;
 use core::fmt;
 use core::num::TryFromIntError;
 use core::str::Utf8Error;
@@ -142,7 +143,7 @@  pub fn name(&self) -> Option<&'static CStr> {
             None
         } else {
             // SAFETY: The string returned by `errname` is static and `NUL`-terminated.
-            Some(unsafe { CStr::from_char_ptr(ptr) })
+            Some(unsafe { CStr::from_ptr(ptr) })
         }
     }
 
@@ -164,7 +165,7 @@  fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
             None => f.debug_tuple("Error").field(&-self.0).finish(),
             // SAFETY: These strings are ASCII-only.
             Some(name) => f
-                .debug_tuple(unsafe { core::str::from_utf8_unchecked(name) })
+                .debug_tuple(unsafe { core::str::from_utf8_unchecked(name.to_bytes()) })
                 .finish(),
         }
     }
diff --git a/rust/kernel/init.rs b/rust/kernel/init.rs
index 68605b633e73..af0017e56c0e 100644
--- a/rust/kernel/init.rs
+++ b/rust/kernel/init.rs
@@ -46,7 +46,7 @@ 
 //! }
 //!
 //! let foo = pin_init!(Foo {
-//!     a <- new_mutex!(42, "Foo::a"),
+//!     a <- new_mutex!(42, c"Foo::a"),
 //!     b: 24,
 //! });
 //! ```
@@ -65,7 +65,7 @@ 
 //! #     b: u32,
 //! # }
 //! # let foo = pin_init!(Foo {
-//! #     a <- new_mutex!(42, "Foo::a"),
+//! #     a <- new_mutex!(42, c"Foo::a"),
 //! #     b: 24,
 //! # });
 //! let foo: Result<Pin<Box<Foo>>> = Box::pin_init(foo, GFP_KERNEL);
@@ -81,7 +81,7 @@ 
 //! ```rust
 //! # use kernel::sync::{new_mutex, Arc, Mutex};
 //! let mtx: Result<Arc<Mutex<usize>>> =
-//!     Arc::pin_init(new_mutex!(42, "example::mtx"), GFP_KERNEL);
+//!     Arc::pin_init(new_mutex!(42, c"example::mtx"), GFP_KERNEL);
 //! ```
 //!
 //! To declare an init macro/function you just return an [`impl PinInit<T, E>`]:
@@ -99,7 +99,7 @@ 
 //! impl DriverData {
 //!     fn new() -> impl PinInit<Self, Error> {
 //!         try_pin_init!(Self {
-//!             status <- new_mutex!(0, "DriverData::status"),
+//!             status <- new_mutex!(0, c"DriverData::status"),
 //!             buffer: Box::init(kernel::init::zeroed(), GFP_KERNEL)?,
 //!         })
 //!     }
diff --git a/rust/kernel/kunit.rs b/rust/kernel/kunit.rs
index 0ba77276ae7e..6da0d5e237b6 100644
--- a/rust/kernel/kunit.rs
+++ b/rust/kernel/kunit.rs
@@ -56,9 +56,13 @@  macro_rules! kunit_assert {
                 break 'out;
             }
 
-            static FILE: &'static $crate::str::CStr = $crate::c_str!($file);
+            static FILE: &'static core::ffi::CStr = $file;
             static LINE: i32 = core::line!() as i32 - $diff;
-            static CONDITION: &'static $crate::str::CStr = $crate::c_str!(stringify!($condition));
+            static CONDITION: &'static core::ffi::CStr = unsafe {
+                core::ffi::CStr::from_bytes_with_nul_unchecked(
+                    core::concat!(stringify!($condition), "\0").as_bytes()
+                )
+            };
 
             // SAFETY: FFI call without safety requirements.
             let kunit_test = unsafe { $crate::bindings::kunit_get_current_test() };
@@ -71,11 +75,11 @@  macro_rules! kunit_assert {
                 //
                 // This mimics KUnit's failed assertion format.
                 $crate::kunit::err(format_args!(
-                    "    # {}: ASSERTION FAILED at {FILE}:{LINE}\n",
+                    "    # {:?}: ASSERTION FAILED at {FILE:?}:{LINE:?}\n",
                     $name
                 ));
                 $crate::kunit::err(format_args!(
-                    "    Expected {CONDITION} to be true, but is false\n"
+                    "    Expected {CONDITION:?} to be true, but is false\n"
                 ));
                 $crate::kunit::err(format_args!(
                     "    Failure not reported to KUnit since this is a non-KUnit task\n"
@@ -98,12 +102,12 @@  unsafe impl Sync for Location {}
             unsafe impl Sync for UnaryAssert {}
 
             static LOCATION: Location = Location($crate::bindings::kunit_loc {
-                file: FILE.as_char_ptr(),
+                file: FILE.as_ptr(),
                 line: LINE,
             });
             static ASSERTION: UnaryAssert = UnaryAssert($crate::bindings::kunit_unary_assert {
                 assert: $crate::bindings::kunit_assert {},
-                condition: CONDITION.as_char_ptr(),
+                condition: CONDITION.as_ptr(),
                 expected_true: true,
             });
 
diff --git a/rust/kernel/net/phy.rs b/rust/kernel/net/phy.rs
index fd40b703d224..19f45922ec42 100644
--- a/rust/kernel/net/phy.rs
+++ b/rust/kernel/net/phy.rs
@@ -502,7 +502,7 @@  unsafe impl Sync for DriverVTable {}
 pub const fn create_phy_driver<T: Driver>() -> DriverVTable {
     // INVARIANT: All the fields of `struct phy_driver` are initialized properly.
     DriverVTable(Opaque::new(bindings::phy_driver {
-        name: T::NAME.as_char_ptr().cast_mut(),
+        name: T::NAME.as_ptr().cast_mut(),
         flags: T::FLAGS,
         phy_id: T::PHY_DEVICE_ID.id,
         phy_id_mask: T::PHY_DEVICE_ID.mask_as_int(),
diff --git a/rust/kernel/prelude.rs b/rust/kernel/prelude.rs
index b37a0b3180fb..5efabfaa5804 100644
--- a/rust/kernel/prelude.rs
+++ b/rust/kernel/prelude.rs
@@ -12,7 +12,7 @@ 
 //! ```
 
 #[doc(no_inline)]
-pub use core::pin::Pin;
+pub use core::{ffi::CStr, pin::Pin};
 
 pub use crate::alloc::{box_ext::BoxExt, flags::*, vec_ext::VecExt};
 
@@ -35,7 +35,7 @@ 
 
 pub use super::error::{code::*, Error, Result};
 
-pub use super::{str::CStr, ThisModule};
+pub use super::{ThisModule};
 
 pub use super::init::{InPlaceInit, Init, PinInit};
 
diff --git a/rust/kernel/str.rs b/rust/kernel/str.rs
index bb8d4f41475b..d1e6335a02a3 100644
--- a/rust/kernel/str.rs
+++ b/rust/kernel/str.rs
@@ -4,8 +4,9 @@ 
 
 use crate::alloc::{flags::*, vec_ext::VecExt, AllocError};
 use alloc::vec::Vec;
+use core::ffi::CStr;
 use core::fmt::{self, Write};
-use core::ops::{self, Deref, DerefMut, Index};
+use core::ops::{Deref, DerefMut};
 
 use crate::error::{code::*, Error};
 
@@ -41,11 +42,11 @@  impl fmt::Display for BStr {
     /// # use kernel::{fmt, b_str, str::{BStr, CString}};
     /// let ascii = b_str!("Hello, BStr!");
     /// let s = CString::try_from_fmt(fmt!("{}", ascii)).unwrap();
-    /// assert_eq!(s.as_bytes(), "Hello, BStr!".as_bytes());
+    /// assert_eq!(s.to_bytes(), "Hello, BStr!".as_bytes());
     ///
     /// let non_ascii = b_str!("