diff mbox

[V9fs-developer] v9fs unaligned access.

Message ID 55E6F6E9.20409@landley.net (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Rob Landley Sept. 2, 2015, 1:17 p.m. UTC
From: Rob Landley <rob@landley.net>

The v9fs driver code was making an unaligned read.

Signed-off-by: Rob Landley <rob@landley.net>
---

 net/9p/trans_fd.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Heads up, not sure this is the right fix for sparse's tag checking?


------------------------------------------------------------------------------
Monitor Your Dynamic Infrastructure at Any Scale With Datadog!
Get real-time metrics from all of your servers, apps and tools
in one place.
SourceForge users - Click here to start your Free Trial of Datadog now!
http://pubads.g.doubleclick.net/gampad/clk?id=241902991&iu=/4140

Comments

Dominique Martinet Sept. 2, 2015, 2:10 p.m. UTC | #1
Rob Landley wrote on Wed, Sep 02, 2015 at 08:17:29AM -0500:
> The v9fs driver code was making an unaligned read.

What's wrong with unaligned reads? memcpy should do pretty much the
same? (Just curious, I guess that if you fixed that it must mean
something!)

>  net/9p/trans_fd.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> Heads up, not sure this is the right fix for sparse's tag checking?

What do you mean by "sparse's tag checking" ?
For what it's worth, rdma does a p9_parse_header, which does a
p9pdu_readf with appropriate arguments. That sounds cleaner to me, and
it'll also take care of the size read just above as le32.


> -		tag = le16_to_cpu(*(__le16 *) (m->rbuf+5)); /* read tag */
> +		memcpy(tag, m->rbuf+5, sizeof(tag)); /* read tag, unaligned */
> +		tag = le16_to_cpu(*(__le16 *)tag);

BTW I've checked and we do encode in le16 when we send, but I'm pretty
sure the protocol specifies native endianness. Even in the
implementation, it doesn't matter to servers if we use a tag 0x0100 or
0x0001 and will send the same tag back.

Do we need to bother with these le16/32/64 conversions?
(both/either in general and here)
Rob Landley Sept. 2, 2015, 3:40 p.m. UTC | #2
On 09/02/2015 09:10 AM, Dominique Martinet wrote:
> Rob Landley wrote on Wed, Sep 02, 2015 at 08:17:29AM -0500:
>> The v9fs driver code was making an unaligned read.
> 
> What's wrong with unaligned reads? memcpy should do pretty much the
> same? (Just curious, I guess that if you fixed that it must mean
> something!)

https://www.kernel.org/doc/Documentation/unaligned-memory-access.txt

>>  net/9p/trans_fd.c |    3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> Heads up, not sure this is the right fix for sparse's tag checking?
> 
> What do you mean by "sparse's tag checking" ?

https://sparse.wiki.kernel.org/index.php/Main_Page

I'm assuming the __le16 thing is a sparse tag.

> For what it's worth, rdma does a p9_parse_header, which does a
> p9pdu_readf with appropriate arguments. That sounds cleaner to me, and
> it'll also take care of the size read just above as le32.

I'm happy to have it fixed a different way. The problemm is the +5 to
the address, that's 1 byte alignment so you can only read bytes from
that on a processor that doesn't internally do alignment fixups.

>> -		tag = le16_to_cpu(*(__le16 *) (m->rbuf+5)); /* read tag */
>> +		memcpy(tag, m->rbuf+5, sizeof(tag)); /* read tag, unaligned */
>> +		tag = le16_to_cpu(*(__le16 *)tag);
> 
> BTW I've checked and we do encode in le16 when we send, but I'm pretty
> sure the protocol specifies native endianness. Even in the
> implementation, it doesn't matter to servers if we use a tag 0x0100 or
> 0x0001 and will send the same tag back.

It's still doing an le16_to_cpu, it's just not doing it directly from
m->rbuf+5.

> Do we need to bother with these le16/32/64 conversions?
> (both/either in general and here)

I assume so? I'm not trying to change the data going across the wire and
a number of platforms (ppc, mipseb, sh2...) are big endian.

Rob

------------------------------------------------------------------------------
Monitor Your Dynamic Infrastructure at Any Scale With Datadog!
Get real-time metrics from all of your servers, apps and tools
in one place.
SourceForge users - Click here to start your Free Trial of Datadog now!
http://pubads.g.doubleclick.net/gampad/clk?id=241902991&iu=/4140
Dominique Martinet Sept. 2, 2015, 4 p.m. UTC | #3
Rob Landley wrote on Wed, Sep 02, 2015 at 10:40:27AM -0500:
> https://www.kernel.org/doc/Documentation/unaligned-memory-access.txt

Thanks, I vaguely remembered something like that.

> > What do you mean by "sparse's tag checking" ?
> 
> https://sparse.wiki.kernel.org/index.php/Main_Page
> 
> I'm assuming the __le16 thing is a sparse tag.

Ok, I was thinking of sparse as in sparse file e.g. not-contigous and it
didn't make much sense.
I'm still not quite sure why the __le16 would be a sparse tag? As you
correctly state underneath it's an endianess conversion function?
I checked out the sparse tree and can't seem to understand how it
relates.

> 
> > For what it's worth, rdma does a p9_parse_header, which does a
> > p9pdu_readf with appropriate arguments. That sounds cleaner to me, and
> > it'll also take care of the size read just above as le32.
> 
> I'm happy to have it fixed a different way. The problemm is the +5 to
> the address, that's 1 byte alignment so you can only read bytes from
> that on a processor that doesn't internally do alignment fixups.

p9_parse_header will do a memcpy ultimately, so it should be safe to
use. I'll submit something tomorrow if you don't want/aren't comfortable
doing it.

> > BTW I've checked and we do encode in le16 when we send, but I'm pretty
> > sure the protocol specifies native endianness. Even in the
> > implementation, it doesn't matter to servers if we use a tag 0x0100 or
> > 0x0001 and will send the same tag back.
> 
> It's still doing an le16_to_cpu, it's just not doing it directly from
> m->rbuf+5.
>
> > Do we need to bother with these le16/32/64 conversions?
> > (both/either in general and here)
> 
> I assume so? I'm not trying to change the data going across the wire and
> a number of platforms (ppc, mipseb, sh2...) are big endian.


Yes, just trying to raise a compltely different point - but it looks
like my memory is wrong anyway and I can find references saying it
should be little endian, so that answers my question.
Rob Landley Sept. 2, 2015, 11:52 p.m. UTC | #4
On 09/02/2015 11:00 AM, Dominique Martinet wrote:
> Rob Landley wrote on Wed, Sep 02, 2015 at 10:40:27AM -0500:
>> https://www.kernel.org/doc/Documentation/unaligned-memory-access.txt
> 
> Thanks, I vaguely remembered something like that.
> 
>>> What do you mean by "sparse's tag checking" ?
>>
>> https://sparse.wiki.kernel.org/index.php/Main_Page
>>
>> I'm assuming the __le16 thing is a sparse tag.
> 
> Ok, I was thinking of sparse as in sparse file e.g. not-contigous and it
> didn't make much sense.
> I'm still not quite sure why the __le16 would be a sparse tag?

This variable should hold little endian 16 bit data, this variable
should hold big endian 16 bit data.

I separate the copy and the conversion into two steps to avoid the
unaligned access, meaning I use the same variable for both and convert
it in place. It's valid, but may confuse the static typechecker the
kernel guys use.

> As you
> correctly state underneath it's an endianess conversion function?
> I checked out the sparse tree and can't seem to understand how it
> relates.

Sparse confirms that labeled variables only take data from or send data
to variables with the same label, except through labeled conversion
functions. So pointers to userspace can be labeled _user and it warns
about dereferencing them without going through copy_to_user() and such.

>>> For what it's worth, rdma does a p9_parse_header, which does a
>>> p9pdu_readf with appropriate arguments. That sounds cleaner to me, and
>>> it'll also take care of the size read just above as le32.
>>
>> I'm happy to have it fixed a different way. The problemm is the +5 to
>> the address, that's 1 byte alignment so you can only read bytes from
>> that on a processor that doesn't internally do alignment fixups.
> 
> p9_parse_header will do a memcpy ultimately, so it should be safe to
> use.

If it does a memcpy to another unaligned variable, that doesn't help.

If this line is executed, it does an unaligned read. It was found
because it faulted for a coworker, so I'm guessing it gets executed.

> I'll submit something tomorrow if you don't want/aren't comfortable
> doing it.

Just juggling 12 things at the moment, and the people on this list are
more familiar with this code than I am anyway. :)

Rob

------------------------------------------------------------------------------
Monitor Your Dynamic Infrastructure at Any Scale With Datadog!
Get real-time metrics from all of your servers, apps and tools
in one place.
SourceForge users - Click here to start your Free Trial of Datadog now!
http://pubads.g.doubleclick.net/gampad/clk?id=241902991&iu=/4140
Latchesar Ionkov Sept. 4, 2015, 2:51 p.m. UTC | #5
If unaligned memory access is bad, why not replace the line with:

tag = m->rbuf[5] + (m->rbuf[6]<<8);

It is definitely more readable, and probably faster too.

Thanks,
    Lucho

On Wed, Sep 2, 2015 at 5:52 PM, Rob Landley <rob@landley.net> wrote:

> On 09/02/2015 11:00 AM, Dominique Martinet wrote:
> > Rob Landley wrote on Wed, Sep 02, 2015 at 10:40:27AM -0500:
> >> https://www.kernel.org/doc/Documentation/unaligned-memory-access.txt
> >
> > Thanks, I vaguely remembered something like that.
> >
> >>> What do you mean by "sparse's tag checking" ?
> >>
> >> https://sparse.wiki.kernel.org/index.php/Main_Page
> >>
> >> I'm assuming the __le16 thing is a sparse tag.
> >
> > Ok, I was thinking of sparse as in sparse file e.g. not-contigous and it
> > didn't make much sense.
> > I'm still not quite sure why the __le16 would be a sparse tag?
>
> This variable should hold little endian 16 bit data, this variable
> should hold big endian 16 bit data.
>
> I separate the copy and the conversion into two steps to avoid the
> unaligned access, meaning I use the same variable for both and convert
> it in place. It's valid, but may confuse the static typechecker the
> kernel guys use.
>
> > As you
> > correctly state underneath it's an endianess conversion function?
> > I checked out the sparse tree and can't seem to understand how it
> > relates.
>
> Sparse confirms that labeled variables only take data from or send data
> to variables with the same label, except through labeled conversion
> functions. So pointers to userspace can be labeled _user and it warns
> about dereferencing them without going through copy_to_user() and such.
>
> >>> For what it's worth, rdma does a p9_parse_header, which does a
> >>> p9pdu_readf with appropriate arguments. That sounds cleaner to me, and
> >>> it'll also take care of the size read just above as le32.
> >>
> >> I'm happy to have it fixed a different way. The problemm is the +5 to
> >> the address, that's 1 byte alignment so you can only read bytes from
> >> that on a processor that doesn't internally do alignment fixups.
> >
> > p9_parse_header will do a memcpy ultimately, so it should be safe to
> > use.
>
> If it does a memcpy to another unaligned variable, that doesn't help.
>
> If this line is executed, it does an unaligned read. It was found
> because it faulted for a coworker, so I'm guessing it gets executed.
>
> > I'll submit something tomorrow if you don't want/aren't comfortable
> > doing it.
>
> Just juggling 12 things at the moment, and the people on this list are
> more familiar with this code than I am anyway. :)
>
> Rob
>
>
> ------------------------------------------------------------------------------
> Monitor Your Dynamic Infrastructure at Any Scale With Datadog!
> Get real-time metrics from all of your servers, apps and tools
> in one place.
> SourceForge users - Click here to start your Free Trial of Datadog now!
> http://pubads.g.doubleclick.net/gampad/clk?id=241902991&iu=/4140
> _______________________________________________
> V9fs-developer mailing list
> V9fs-developer@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/v9fs-developer
>
------------------------------------------------------------------------------
Charles Forsyth Sept. 4, 2015, 3:12 p.m. UTC | #6
On 2 September 2015 at 16:40, Rob Landley <rob@landley.net> wrote:

> BTW I've checked and we do encode in le16 when we send, but I'm pretty
> > sure the protocol specifies native endianness


The protocol specifies the byte order of values on the wire:
"Each message consists of a sequence of bytes.  Two-, four-,
          and eight-byte fields hold unsigned integers represented in
          little-endian order (least significant byte first).  Data
          items of larger or variable lengths are represented by a
          two-byte field specifying a count, n, followed by n bytes of
          data. ..."
------------------------------------------------------------------------------
Charles Forsyth Sept. 4, 2015, 3:18 p.m. UTC | #7
On 4 September 2015 at 16:12, Charles Forsyth <charles.forsyth@gmail.com>
wrote:

> > sure the protocol specifies native endianness
>
>
> The protocol specifies the byte order of values on the wire:


Gmail suddenly sent that on its own before I'd finished.

The order happens to be the same as x86, but the original machines using it
were big-endian,
and a little-endian format was specified to make certain that all the code
didn't anywhere assume "native" order
(there being no right definition of "native" in a distributed system of
heterogeneous machines).
Plan 9's own implementation simply assemble the values byte by byte, much
as lucho suggested,
in plain C and avoids alignment and other problems associated with casting
and punning.
------------------------------------------------------------------------------
Rob Landley Sept. 4, 2015, 10:39 p.m. UTC | #8
On 09/04/2015 09:51 AM, Latchesar Ionkov wrote:
> If unaligned memory access is bad, why not replace the line with:
>
> tag = m->rbuf[5] + (m->rbuf[6]<<8);

char *rbuf can be signed, so you may need a typecast in there...?

> It is definitely more readable, and probably faster too.
>
> Thanks,
>     Lucho

I'm all for it, however Dominique seems to want to pull this into a
larger change for unrelated reasons.

I'm not currently in a position to test it because the patch was
originally against a 3.4 kernel, and although the code needing patching
hasn't changed since I haven't forward ported the (rather ugly) ethernet
driver out of the 3.x series yet (the numato boards
http://nommu.org/jcore is using don't have ethernet hardware so it was a
bit down the todo list). I need to forward port the ethernet driver to
have a 4.2 test system to test the larger change against.

All I can say at the moment is the old change fixed it on the old
kernel, and the code being patched was the same in the new kernel. Maybe
it would be easier to try to get diod running on the sh2 system and
loopback mount a filesystem? No idea if it's nommu-friendly...

Rob

------------------------------------------------------------------------------
Dominique Martinet Sept. 5, 2015, 6:27 a.m. UTC | #9
Rob Landley wrote on Fri, Sep 04, 2015:
> I'm all for it, however Dominique seems to want to pull this into a
> larger change for unrelated reasons.

Happy to take whichever.
I hadn't looked at the trans_fd code in age and thought it'd be more
reliable to use the same parse header function that we already have, but
it ended up much bigger than I originally thought it'd be (even if most
of it is just renaming the variables in the read worker to have a pdu)

Posted it anyway since I had it working, but as I said in the patch
itself happy to have anything else taken instead - memcpy or reading one
byte at a time both are good with me (assuming there's no problem with
sign yeah)
Eric Van Hensbergen Sept. 6, 2015, 2:59 a.m. UTC | #10
While I always liked the elegance of the byte-by-byte assembly like in
the original plan 9 code, we do already have the parse-header code
which is used for the other transports, so it does seem to be the more
sensible path.  I'm going to pull this into for-next.  If someone
feels really strongly against it (or perhaps submit a patch switching
pdu_read to use byte-by-byte assembly) I'm happy to continue the
discussion.

       -eric


On Sat, Sep 5, 2015 at 1:27 AM, Dominique Martinet
<asmadeus@codewreck.org> wrote:
> Rob Landley wrote on Fri, Sep 04, 2015:
>> I'm all for it, however Dominique seems to want to pull this into a
>> larger change for unrelated reasons.
>
> Happy to take whichever.
> I hadn't looked at the trans_fd code in age and thought it'd be more
> reliable to use the same parse header function that we already have, but
> it ended up much bigger than I originally thought it'd be (even if most
> of it is just renaming the variables in the read worker to have a pdu)
>
> Posted it anyway since I had it working, but as I said in the patch
> itself happy to have anything else taken instead - memcpy or reading one
> byte at a time both are good with me (assuming there's no problem with
> sign yeah)
>
> --
> Dominique
>
> ------------------------------------------------------------------------------
> _______________________________________________
> V9fs-developer mailing list
> V9fs-developer@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/v9fs-developer

------------------------------------------------------------------------------
diff mbox

Patch

diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index bced8c0..09ec72c 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -340,7 +340,8 @@  static void p9_read_work(struct work_struct *work)
 			goto error;
 		}
 
-		tag = le16_to_cpu(*(__le16 *) (m->rbuf+5)); /* read tag */
+		memcpy(tag, m->rbuf+5, sizeof(tag)); /* read tag, unaligned */
+		tag = le16_to_cpu(*(__le16 *)tag);
 		p9_debug(P9_DEBUG_TRANS,
 			 "mux %p pkt: size: %d bytes tag: %d\n", m, n, tag);