
IB/core: Fix unaligned accesses

Message ID 1430340983-12538-1-git-send-email-david.ahern@oracle.com (mailing list archive)
State Superseded

Commit Message

David Ahern April 29, 2015, 8:56 p.m. UTC
Addresses the following kernel logs seen during boot of sparc systems:

Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]

Signed-off-by: David Ahern <david.ahern@oracle.com>
---
 drivers/infiniband/core/cm.c |   18 +++++++++++++++---
 1 files changed, 15 insertions(+), 3 deletions(-)

Comments

Jason Gunthorpe April 29, 2015, 9:18 p.m. UTC | #1
On Wed, Apr 29, 2015 at 04:56:23PM -0400, David Ahern wrote:

>  static void cm_mask_copy(u8 *dst, u8 *src, u8 *mask)
>  {

Do you know what call site is unaligned? A quick audit suggests that
fixing a few possibly troublesome callers to be guaranteed 32-bit
aligned should be fairly straightforward.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Ahern April 29, 2015, 9:24 p.m. UTC | #2
On 4/29/15 3:18 PM, Jason Gunthorpe wrote:
> On Wed, Apr 29, 2015 at 04:56:23PM -0400, David Ahern wrote:
>
>>   static void cm_mask_copy(u8 *dst, u8 *src, u8 *mask)
>>   {
>
> Do you know what call site is unaligned? A quick audit suggests that
> fixing a few possibly troublesome callers to be guaranteed 32-bit
> aligned should be fairly straightforward.


I believe this case is

cm_find_listen() -> cm_compare_private_data() -> cm_mask_copy()
Jason Gunthorpe April 29, 2015, 9:30 p.m. UTC | #3
On Wed, Apr 29, 2015 at 03:24:19PM -0600, David Ahern wrote:
> On 4/29/15 3:18 PM, Jason Gunthorpe wrote:
> >On Wed, Apr 29, 2015 at 04:56:23PM -0400, David Ahern wrote:
> >
> >>  static void cm_mask_copy(u8 *dst, u8 *src, u8 *mask)
> >>  {
> >
> >Do you know what call site is unaligned? A quick audit suggests that
> >fixing a few possibly troublesome callers to be guaranteed 32-bit
> >aligned should be fairly straightforward.
> 
> 
> I believe this case is
> 
> cm_find_listen() -> cm_compare_private_data() -> cm_mask_copy()

Right, I'm pretty sure that is the main place that would have the
32-bit alignment limitation, since it is an on-the-wire message.

I suggest changing the signature to:

cm_mask_copy(u32 *dst, const u32 *src, const u32 *mask)

And dealing with the fairly few resulting changes..

Jason
David Ahern April 29, 2015, 9:38 p.m. UTC | #4
On 4/29/15 3:30 PM, Jason Gunthorpe wrote:
> On Wed, Apr 29, 2015 at 03:24:19PM -0600, David Ahern wrote:
>> On 4/29/15 3:18 PM, Jason Gunthorpe wrote:
>>> On Wed, Apr 29, 2015 at 04:56:23PM -0400, David Ahern wrote:
>>>
>>>>   static void cm_mask_copy(u8 *dst, u8 *src, u8 *mask)
>>>>   {
>>>
>>> Do you know what call site is unaligned? A quick audit suggests that
>>> fixing a few possibly troublesome callers to be guaranteed 32-bit
>>> aligned should be fairly straightforward.
>>
>>
>> I believe this case is
>>
>> cm_find_listen() -> cm_compare_private_data() -> cm_mask_copy()
>
> Right, I'm pretty sure that is the main place that would have the
> 32-bit alignment limitation, since it is an on-the-wire message.
>
> I suggest changing the signature to:
>
> cm_mask_copy(u32 *dst, const u32 *src, const u32 *mask)
>
> And dealing with the fairly few resulting changes..

Confused. That does not deal with the alignment problem. Internal to
cm_mask_copy, unsigned longs (8 bytes) are used, so why change the
signature to u32?


Jason Gunthorpe April 29, 2015, 9:51 p.m. UTC | #5
On Wed, Apr 29, 2015 at 03:38:22PM -0600, David Ahern wrote:

> >And dealing with the fairly few resulting changes..
> 
> Confused. That does not deal with the alignment problem. Internal to
> cm_mask_copy, unsigned longs (8 bytes) are used, so why change the
> signature to u32?

You'd change the loop stride to be u32 as well.

This whole thing is just an attempted optimization, but doing copy and
mask 8 bytes at a time on unaligned data is not very efficient, even
on x86.

So either drop the optimization and use u8 as the stride.

Or keep the optimization and guarantee alignment; the best we can do
is u32.

Since this is an optimization, get_unaligned should be avoided,
looping over u8 would be faster.

Jason
David Ahern April 29, 2015, 9:59 p.m. UTC | #6
On 4/29/15 3:51 PM, Jason Gunthorpe wrote:
> On Wed, Apr 29, 2015 at 03:38:22PM -0600, David Ahern wrote:
>
>>> And dealing with the fairly few resulting changes..
>>
>> Confused. That does not deal with the alignment problem. Internal to
>> cm_mask_copy, unsigned longs (8 bytes) are used, so why change the
>> signature to u32?
>
> You'd change the loop stride to be u32 as well.
>
> This whole thing is just an attempted optimization, but doing copy and
> mask 8 bytes at a time on unaligned data is not very efficient, even
> on x86.
>
> So either drop the optimization and use u8 as the stride.
>
> Or keep the optimization and guarantee alignment; the best we can do
> is u32.
>
> Since this is an optimization, get_unaligned should be avoided,
> looping over u8 would be faster.

Sorry to be dense on this, but I still don't see how your proposal
addresses the underlying problem -- guaranteeing 8-byte alignment. Be
it u8 or u32 for the input arguments, the cast to unsigned long says
that an extended load operation can be used for the access. For that
to work, the addresses (src, dst, mask) must all be 8-byte aligned.

David
Jason Gunthorpe April 29, 2015, 10:15 p.m. UTC | #7
On Wed, Apr 29, 2015 at 03:59:20PM -0600, David Ahern wrote:
> On 4/29/15 3:51 PM, Jason Gunthorpe wrote:
> >On Wed, Apr 29, 2015 at 03:38:22PM -0600, David Ahern wrote:
> >
> >>>And dealing with the fairly few resulting changes..
> >>
> >>Confused. That does not deal with the alignment problem. Internal to
> >>cm_mask_copy, unsigned longs (8 bytes) are used, so why change the
> >>signature to u32?
> >
> >You'd change the loop stride to be u32 as well.
> >
> >This whole thing is just an attempted optimization, but doing copy and
> >mask 8 bytes at a time on unaligned data is not very efficient, even
> >on x86.
> >
> >So either drop the optimization and use u8 as the stride.
> >
> >Or keep the optimization and guarantee alignment; the best we can do
> >is u32.
> >
> >Since this is an optimization, get_unaligned should be avoided,
> >looping over u8 would be faster.
> 
> Sorry to be dense on this, but I still don't see how your proposal
> addresses the underlying problem -- guaranteeing 8-byte alignment.
> Be it u8 or u32 for the input arguments, the cast to unsigned long
> says that an extended load operation can be used for the access. For
> that to work, the addresses (src, dst, mask) must all be 8-byte
> aligned.

Read carefully:

> >You'd change the loop stride to be u32 as well.

which means this:

static void cm_mask_copy(u32 *dst, const u32 *src, const u32 *mask)
{
	int i;

	for (i = 0; i < IB_CM_COMPARE_SIZE; i++)
		dst[i] = src[i] & mask[i];
}

Divide IB_CM_COMPARE_SIZE by 4, adjust all users.

Make other related type changes the compiler tells you about.

Do not add casts.

Jason
David Ahern April 29, 2015, 10:34 p.m. UTC | #8
On 4/29/15 4:15 PM, Jason Gunthorpe wrote:
> Read carefully:
>
> > >You'd change the loop stride to be u32 as well.
> which means this:
>
> static void cm_mask_copy(u32 *dst, const u32 *src, const u32 *mask)
> {
> 	int i;
>
> 	for (i = 0; i < IB_CM_COMPARE_SIZE; i++)
> 		dst[i] = src[i] & mask[i];
> }
>
> Divide IB_CM_COMPARE_SIZE by 4, adjust all users.
>
> Make other related type changes the compiler tells you about.
>
> Do not add casts.

Got it. That was a lot clearer.

Patch

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index e28a494..5102cfe 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -439,11 +439,23 @@  static struct cm_id_private * cm_acquire_id(__be32 local_id, __be32 remote_id)
 
 static void cm_mask_copy(u8 *dst, u8 *src, u8 *mask)
 {
+	unsigned long *plsrc, *plmask, *pldst;
 	int i;
 
-	for (i = 0; i < IB_CM_COMPARE_SIZE / sizeof(unsigned long); i++)
-		((unsigned long *) dst)[i] = ((unsigned long *) src)[i] &
-					     ((unsigned long *) mask)[i];
+	/* unsigned longs can use extended load operations which
+	 * can require 8-byte alignments. dst, src and mask are
+	 * not guaranteed to be aligned.
+	 */
+	pldst = (unsigned long *) dst;
+	plsrc = (unsigned long *) src;
+	plmask = (unsigned long *) mask;
+	for (i = 0; i < IB_CM_COMPARE_SIZE / sizeof(unsigned long); i++) {
+		unsigned long lsrc, lmask;
+
+		lsrc = get_unaligned(&plsrc[i]);
+		lmask = get_unaligned(&plmask[i]);
+		put_unaligned(lsrc & lmask, &pldst[i]);
+	}
 }
 
 static int cm_compare_data(struct ib_cm_compare_data *src_data,