diff mbox series

[RFC] scsi: target: tcmu: running 32bit userspace on 64bit kernel

Message ID 364c13da-6d4f-c28b-36b3-082db8c3de58@ts.fujitsu.com (mailing list archive)
State Changes Requested
Headers show
Series [RFC] scsi: target: tcmu: running 32bit userspace on 64bit kernel | expand

Commit Message

Bodo Stroesser June 30, 2020, 4:49 p.m. UTC
Hi,

When using tcmu it might happen, that userspace application cannot be
built as 64 bit program even on a 64 bit host due to existing 32 bit
libraries that must be used, e.g. for compression, encryption,
deduplication, ...

Currently this only works with manual changes in userspace include
file target_core_user.h due to a missing padding field in
struct tcmu_cmd_entry.

Here are field offsets printed by a small program on a 64 bit host,
compiled as 64 bit program and as 32 bit:

Devel:~ # gcc -o print_offsets print_offsets.c
Devel:~ # ./print_offsets
Offset of tcmu_cmd_entry.hdr.len_op          = 0000
Offset of tcmu_cmd_entry.hdr.cmd_id          = 0004
Offset of tcmu_cmd_entry.hdr.kflags          = 0006
Offset of tcmu_cmd_entry.hdr.uflags          = 0007

Offset of tcmu_cmd_entry.req.iov_cnt         = 0008
Offset of tcmu_cmd_entry.req.iov_bidi_cnt    = 000c
Offset of tcmu_cmd_entry.req.iov_dif_cnt     = 0010
Offset of tcmu_cmd_entry.req.cdb_off         = 0018
Offset of tcmu_cmd_entry.req.iov[0]          = 0030

Offset of tcmu_cmd_entry.rsp.scsi_status     = 0008
Offset of tcmu_cmd_entry.rsp.read_len        = 000c
Offset of tcmu_cmd_entry.rsp.sense_buffer[0] = 0010

Size of struct tcmu_cmd_entry = 0070


Devel:~ # gcc -m32 -o print_offsets print_offsets.c
Devel:~ # ./print_offsets
Offset of tcmu_cmd_entry.hdr.len_op          = 0000
Offset of tcmu_cmd_entry.hdr.cmd_id          = 0004
Offset of tcmu_cmd_entry.hdr.kflags          = 0006
Offset of tcmu_cmd_entry.hdr.uflags          = 0007

Offset of tcmu_cmd_entry.req.iov_cnt         = 0008
Offset of tcmu_cmd_entry.req.iov_bidi_cnt    = 000c
Offset of tcmu_cmd_entry.req.iov_dif_cnt     = 0010
Offset of tcmu_cmd_entry.req.cdb_off         = 0014
Offset of tcmu_cmd_entry.req.iov[0]          = 002c

Offset of tcmu_cmd_entry.rsp.scsi_status     = 0008
Offset of tcmu_cmd_entry.rsp.read_len        = 000c
Offset of tcmu_cmd_entry.rsp.sense_buffer[0] = 0010

Size of struct tcmu_cmd_entry = 0070


The offset of the fields req.cdb_off and req.iov differ for 64-bit
and 32-bit compilation.

That means:
 - 64-bit application on 64-bit host works well
 - 32-bit application on 32-bit host works well
 - 32-bit application on 64-bit host fails.

Unfortunately I don't see a way to fix this problem such, that
32-bit application runs fine on 32-bit and 64-bit host without
breaking compatibility.

So I'm wondering whether the following change would be a viable
solution:



Using this change we can do:

Devel:~ # gcc -m32 -DAPPL32BIT_ON_KERNEL64BIT -o print_offsets print_offsets.c
Devel:~ # ./print_offsets
Offset of tcmu_cmd_entry.hdr.len_op          = 0000
Offset of tcmu_cmd_entry.hdr.cmd_id          = 0004
Offset of tcmu_cmd_entry.hdr.kflags          = 0006
Offset of tcmu_cmd_entry.hdr.uflags          = 0007

Offset of tcmu_cmd_entry.req.iov_cnt         = 0008
Offset of tcmu_cmd_entry.req.iov_bidi_cnt    = 000c
Offset of tcmu_cmd_entry.req.iov_dif_cnt     = 0010
Offset of tcmu_cmd_entry.req.cdb_off         = 0018
Offset of tcmu_cmd_entry.req.iov[0]          = 0030

Offset of tcmu_cmd_entry.rsp.scsi_status     = 0008
Offset of tcmu_cmd_entry.rsp.read_len        = 000c
Offset of tcmu_cmd_entry.rsp.sense_buffer[0] = 0010

Size of struct tcmu_cmd_entry = 0070


So we can compile a 32-bit application that works on 64-bit kernel without
need to manipulate the include file prepared by the kernel.

What do you think? Do you know a better solution?

Thank you,
Bodo

Comments

James Bottomley June 30, 2020, 4:52 p.m. UTC | #1
On Tue, 2020-06-30 at 18:49 +0200, Bodo Stroesser wrote:
> Hi,
> 
> When using tcmu it might happen, that userspace application cannot be
> built as 64 bit program even on a 64 bit host due to existing 32 bit
> libraries that must be used, e.g. for compression, encryption,
> deduplication, ...
> 
> Currently this only works with manual changes in userspace include
> file target_core_user.h due to a missing padding field in
> struct tcmu_cmd_entry.
> 
> Here are field offsets printed by a small program on a 64 bit host,
> compiled as 64 bit program and as 32 bit:
> 
> Devel:~ # gcc -o print_offsets print_offsets.c
> Devel:~ # ./print_offsets
> Offset of tcmu_cmd_entry.hdr.len_op          = 0000
> Offset of tcmu_cmd_entry.hdr.cmd_id          = 0004
> Offset of tcmu_cmd_entry.hdr.kflags          = 0006
> Offset of tcmu_cmd_entry.hdr.uflags          = 0007
> 
> Offset of tcmu_cmd_entry.req.iov_cnt         = 0008
> Offset of tcmu_cmd_entry.req.iov_bidi_cnt    = 000c
> Offset of tcmu_cmd_entry.req.iov_dif_cnt     = 0010
> Offset of tcmu_cmd_entry.req.cdb_off         = 0018
> Offset of tcmu_cmd_entry.req.iov[0]          = 0030
> 
> Offset of tcmu_cmd_entry.rsp.scsi_status     = 0008
> Offset of tcmu_cmd_entry.rsp.read_len        = 000c
> Offset of tcmu_cmd_entry.rsp.sense_buffer[0] = 0010
> 
> Size of struct tcmu_cmd_entry = 0070
> 
> 
> Devel:~ # gcc -m32 -o print_offsets print_offsets.c
> Devel:~ # ./print_offsets
> Offset of tcmu_cmd_entry.hdr.len_op          = 0000
> Offset of tcmu_cmd_entry.hdr.cmd_id          = 0004
> Offset of tcmu_cmd_entry.hdr.kflags          = 0006
> Offset of tcmu_cmd_entry.hdr.uflags          = 0007
> 
> Offset of tcmu_cmd_entry.req.iov_cnt         = 0008
> Offset of tcmu_cmd_entry.req.iov_bidi_cnt    = 000c
> Offset of tcmu_cmd_entry.req.iov_dif_cnt     = 0010
> Offset of tcmu_cmd_entry.req.cdb_off         = 0014
> Offset of tcmu_cmd_entry.req.iov[0]          = 002c
> 
> Offset of tcmu_cmd_entry.rsp.scsi_status     = 0008
> Offset of tcmu_cmd_entry.rsp.read_len        = 000c
> Offset of tcmu_cmd_entry.rsp.sense_buffer[0] = 0010
> 
> Size of struct tcmu_cmd_entry = 0070
> 
> 
> The offset of the fields req.cdb_off and req.iov differ for 64-bit
> and 32-bit compilation.
> 
> That means:
>  - 64-bit application on 64-bit host works well
>  - 32-bit application on 32-bit host works well
>  - 32-bit application on 64-bit host fails.
> 
> Unfortunately I don't see a way to fix this problem such, that
> 32-bit application runs fine on 32-bit and 64-bit host without
> breaking compatibility.
> 
> So I'm wondering whether the following change would be a viable
> solution:
> 
> diff --git a/include/uapi/linux/target_core_user.h
> b/include/uapi/linux/target_core_user.h
> --- a/include/uapi/linux/target_core_user.h
> +++ b/include/uapi/linux/target_core_user.h
> @@ -114,6 +114,9 @@ struct tcmu_cmd_entry {
>  			__u32 iov_cnt;
>  			__u32 iov_bidi_cnt;
>  			__u32 iov_dif_cnt;
> +#ifdef APPL32BIT_ON_KERNEL64BIT
> +			__u32 __pad9;
> +#endif
>  			__u64 cdb_off;
>  			__u64 __pad1;
>  			__u64 __pad2;
> 
> 
> Using this change we can do:
> 
> Devel:~ # gcc -m32 -DAPPL32BIT_ON_KERNEL64BIT -o print_offsets
> print_offsets.c
> Devel:~ # ./print_offsets
> Offset of tcmu_cmd_entry.hdr.len_op          = 0000
> Offset of tcmu_cmd_entry.hdr.cmd_id          = 0004
> Offset of tcmu_cmd_entry.hdr.kflags          = 0006
> Offset of tcmu_cmd_entry.hdr.uflags          = 0007
> 
> Offset of tcmu_cmd_entry.req.iov_cnt         = 0008
> Offset of tcmu_cmd_entry.req.iov_bidi_cnt    = 000c
> Offset of tcmu_cmd_entry.req.iov_dif_cnt     = 0010
> Offset of tcmu_cmd_entry.req.cdb_off         = 0018
> Offset of tcmu_cmd_entry.req.iov[0]          = 0030
> 
> Offset of tcmu_cmd_entry.rsp.scsi_status     = 0008
> Offset of tcmu_cmd_entry.rsp.read_len        = 000c
> Offset of tcmu_cmd_entry.rsp.sense_buffer[0] = 0010
> 
> Size of struct tcmu_cmd_entry = 0070
> 
> 
> So we can compile a 32-bit application that works on 64-bit kernel
> without need to manipulate the include file prepared by the kernel.
> 
> What do you think? Do you know a better solution?

Can you not use something similar to the compat_ioctl mechanism?  the
job of the compat layer is to re-layout the input and output structures
to impedance match between 32 and 64 bit.

James
Bodo Stroesser June 30, 2020, 5:17 p.m. UTC | #2
On 2020-06-30 18:52, James Bottomley wrote:
> On Tue, 2020-06-30 at 18:49 +0200, Bodo Stroesser wrote:
>> Hi,
>>
>> When using tcmu it might happen, that userspace application cannot be
>> built as 64 bit program even on a 64 bit host due to existing 32 bit
>> libraries that must be used, e.g. for compression, encryption,
>> deduplication, ...
>>
>> Currently this only works with manual changes in userspace include
>> file target_core_user.h due to a missing padding field in
>> struct tcmu_cmd_entry.
>>
>> Here are field offsets printed by a small program on a 64 bit host,
>> compiled as 64 bit program and as 32 bit:
>>
>> Devel:~ # gcc -o print_offsets print_offsets.c
>> Devel:~ # ./print_offsets
>> Offset of tcmu_cmd_entry.hdr.len_op          = 0000
>> Offset of tcmu_cmd_entry.hdr.cmd_id          = 0004
>> Offset of tcmu_cmd_entry.hdr.kflags          = 0006
>> Offset of tcmu_cmd_entry.hdr.uflags          = 0007
>>
>> Offset of tcmu_cmd_entry.req.iov_cnt         = 0008
>> Offset of tcmu_cmd_entry.req.iov_bidi_cnt    = 000c
>> Offset of tcmu_cmd_entry.req.iov_dif_cnt     = 0010
>> Offset of tcmu_cmd_entry.req.cdb_off         = 0018
>> Offset of tcmu_cmd_entry.req.iov[0]          = 0030
>>
>> Offset of tcmu_cmd_entry.rsp.scsi_status     = 0008
>> Offset of tcmu_cmd_entry.rsp.read_len        = 000c
>> Offset of tcmu_cmd_entry.rsp.sense_buffer[0] = 0010
>>
>> Size of struct tcmu_cmd_entry = 0070
>>
>>
>> Devel:~ # gcc -m32 -o print_offsets print_offsets.c
>> Devel:~ # ./print_offsets
>> Offset of tcmu_cmd_entry.hdr.len_op          = 0000
>> Offset of tcmu_cmd_entry.hdr.cmd_id          = 0004
>> Offset of tcmu_cmd_entry.hdr.kflags          = 0006
>> Offset of tcmu_cmd_entry.hdr.uflags          = 0007
>>
>> Offset of tcmu_cmd_entry.req.iov_cnt         = 0008
>> Offset of tcmu_cmd_entry.req.iov_bidi_cnt    = 000c
>> Offset of tcmu_cmd_entry.req.iov_dif_cnt     = 0010
>> Offset of tcmu_cmd_entry.req.cdb_off         = 0014
>> Offset of tcmu_cmd_entry.req.iov[0]          = 002c
>>
>> Offset of tcmu_cmd_entry.rsp.scsi_status     = 0008
>> Offset of tcmu_cmd_entry.rsp.read_len        = 000c
>> Offset of tcmu_cmd_entry.rsp.sense_buffer[0] = 0010
>>
>> Size of struct tcmu_cmd_entry = 0070
>>
>>
>> The offset of the fields req.cdb_off and req.iov differ for 64-bit
>> and 32-bit compilation.
>>
>> That means:
>>   - 64-bit application on 64-bit host works well
>>   - 32-bit application on 32-bit host works well
>>   - 32-bit application on 64-bit host fails.
>>
>> Unfortunately I don't see a way to fix this problem such, that
>> 32-bit application runs fine on 32-bit and 64-bit host without
>> breaking compatibility.
>>
>> So I'm wondering whether the following change would be a viable
>> solution:
>>
>> diff --git a/include/uapi/linux/target_core_user.h
>> b/include/uapi/linux/target_core_user.h
>> --- a/include/uapi/linux/target_core_user.h
>> +++ b/include/uapi/linux/target_core_user.h
>> @@ -114,6 +114,9 @@ struct tcmu_cmd_entry {
>>   			__u32 iov_cnt;
>>   			__u32 iov_bidi_cnt;
>>   			__u32 iov_dif_cnt;
>> +#ifdef APPL32BIT_ON_KERNEL64BIT
>> +			__u32 __pad9;
>> +#endif
>>   			__u64 cdb_off;
>>   			__u64 __pad1;
>>   			__u64 __pad2;
>>
>>
>> Using this change we can do:
>>
>> Devel:~ # gcc -m32 -DAPPL32BIT_ON_KERNEL64BIT -o print_offsets
>> print_offsets.c
>> Devel:~ # ./print_offsets
>> Offset of tcmu_cmd_entry.hdr.len_op          = 0000
>> Offset of tcmu_cmd_entry.hdr.cmd_id          = 0004
>> Offset of tcmu_cmd_entry.hdr.kflags          = 0006
>> Offset of tcmu_cmd_entry.hdr.uflags          = 0007
>>
>> Offset of tcmu_cmd_entry.req.iov_cnt         = 0008
>> Offset of tcmu_cmd_entry.req.iov_bidi_cnt    = 000c
>> Offset of tcmu_cmd_entry.req.iov_dif_cnt     = 0010
>> Offset of tcmu_cmd_entry.req.cdb_off         = 0018
>> Offset of tcmu_cmd_entry.req.iov[0]          = 0030
>>
>> Offset of tcmu_cmd_entry.rsp.scsi_status     = 0008
>> Offset of tcmu_cmd_entry.rsp.read_len        = 000c
>> Offset of tcmu_cmd_entry.rsp.sense_buffer[0] = 0010
>>
>> Size of struct tcmu_cmd_entry = 0070
>>
>>
>> So we can compile a 32-bit application that works on 64-bit kernel
>> without need to manipulate the include file prepared by the kernel.
>>
>> What do you think? Do you know a better solution?
> 
> Can you not use something similar to the compat_ioctl mechanism?  the
> job of the compat layer is to re-layout the input and output structures
> to impedance match between 32 and 64 bit.
> 
> James
> 

struct tcmu_cmd_entry is prepared by kernel and exposed to userspace via
an mmap()ed uio device. From tcmu module point of view it is vmalloc
area.

So there is no 'glue' code that could do conversion.

I'm not sure whether there is a way for tcmu to find out, what kind of
userspace application is running. If it would know, it probably could
prepare the entries accordingly.

But worst case it could happen that entries prepared for 32-bit, after
application stop and start, then are read by a new application version
that is 64-bit ...

Maybe we could add an configFS attribute allowing userspace to configure
tcmu accordingly. A change of this attribute could also flush existing
tcmu_cmd_entries. But that would mean that userspace application has to
take care.

Isn't the compiler definition easier to use?

Bodo
James Bottomley June 30, 2020, 5:27 p.m. UTC | #3
On Tue, 2020-06-30 at 19:17 +0200, Bodo Stroesser wrote:
> On 2020-06-30 18:52, James Bottomley wrote:
> > On Tue, 2020-06-30 at 18:49 +0200, Bodo Stroesser wrote:
[...]
> > > So we can compile a 32-bit application that works on 64-bit
> > > kernel without need to manipulate the include file prepared by
> > > the kernel.
> > > 
> > > What do you think? Do you know a better solution?
> > 
> > Can you not use something similar to the compat_ioctl
> > mechanism?  the job of the compat layer is to re-layout the input
> > and output structures to impedance match between 32 and 64 bit.
> > 
> > James
> > 
> 
> struct tcmu_cmd_entry is prepared by kernel and exposed to userspace
> via an mmap()ed uio device. From tcmu module point of view it is
> vmalloc area.
> 
> So there is no 'glue' code that could do conversion.

OK, so can't you do it like a vdso then?  For them we detect on mmap
what the architectural model is and map different areas depending on
that, so effectively you have a 64 and a 32 bit layout area and which
one you map depends on the architecture type you see coming into the
setup call.  You expect only one to be mapped and you only do extra
conversions on the impedance mismatch case.

> I'm not sure whether there is a way for tcmu to find out, what kind
> of userspace application is running. If it would know, it probably
> could prepare the entries accordingly.

I forget if the vdso switch is exposed outside the architecture, but an
inspection of the code should tell you.

> But worst case it could happen that entries prepared for 32-bit,
> after application stop and start, then are read by a new application
> version that is 64-bit ...

The userspace application absolutely knows its binary model.  The
kernel sort of knows (its known in the ELF32 execution model and the
like).

> Maybe we could add an configFS attribute allowing userspace to
> configure tcmu accordingly. A change of this attribute could also
> flush existing tcmu_cmd_entries. But that would mean that userspace
> application has to take care.
> 
> Isn't the compiler definition easier to use?

Well, yes, but that's an ABI change which I thought you were trying to
avoid?

James
Bodo Stroesser June 30, 2020, 6 p.m. UTC | #4
On 2020-06-30 19:27, James Bottomley wrote:
> On Tue, 2020-06-30 at 19:17 +0200, Bodo Stroesser wrote:
>> On 2020-06-30 18:52, James Bottomley wrote:
>>> On Tue, 2020-06-30 at 18:49 +0200, Bodo Stroesser wrote:
> [...]
>>>> So we can compile a 32-bit application that works on 64-bit
>>>> kernel without need to manipulate the include file prepared by
>>>> the kernel.
>>>>
>>>> What do you think? Do you know a better solution?
>>>
>>> Can you not use something similar to the compat_ioctl
>>> mechanism?  the job of the compat layer is to re-layout the input
>>> and output structures to impedance match between 32 and 64 bit.
>>>
>>> James
>>>
>>
>> struct tcmu_cmd_entry is prepared by kernel and exposed to userspace
>> via an mmap()ed uio device. From tcmu module point of view it is
>> vmalloc area.
>>
>> So there is no 'glue' code that could do conversion.
> 
> OK, so can't you do it like a vdso then?  For them we detect on mmap
> what the architectural model is and map different areas depending on
> that, so effectively you have a 64 and a 32 bit layout area and which
> one you map depends on the architecture type you see coming into the
> setup call.  You expect only one to be mapped and you only do extra
> conversions on the impedance mismatch case.

I think, vdso and tcmu's uio mmaps are different in vdso AFAIK being
constant code while the uio mmap contains data (e.g. tcmu_cmd_entries) 
that is written and read by tcmu module and userspace. So kernel needs 
to know what format of entries userspace expects.
Additionally, the data is preserved over close and re-open of the uio
device and mapping. So it would be necessary to re-format data existing
in a ring buffer according to the appl arch. I don't like this idea.
But alternatively on architecture change we could flush the ring buffer.

Hmm. tcmu is called during mmap() for the uio-device. If we can
determine the appl arch at that time, it would work. Let me check.

Bodo
diff mbox series

Patch

diff --git a/include/uapi/linux/target_core_user.h b/include/uapi/linux/target_core_user.h
--- a/include/uapi/linux/target_core_user.h
+++ b/include/uapi/linux/target_core_user.h
@@ -114,6 +114,9 @@  struct tcmu_cmd_entry {
 			__u32 iov_cnt;
 			__u32 iov_bidi_cnt;
 			__u32 iov_dif_cnt;
+#ifdef APPL32BIT_ON_KERNEL64BIT
+			__u32 __pad9;
+#endif
 			__u64 cdb_off;
 			__u64 __pad1;
 			__u64 __pad2;