Message ID | 80200c53ae54f6cb34bd6fb51e9da65fdcc03004.1461073602.git.jpoimboe@redhat.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
Current kernel (4.6.0-rc4+) + GCC 5.3.0 definitely truncated qla2x00_get_host_fabric_name() routine. Just like Josh indicated, we’re dropping down to the next routine. root@mars:/sys/class/fc_host/host3 2016-04-22 16:07:30 > cat fabric_name Killed —— static void qla2x00_get_host_fabric_name(struct Scsi_Host *shost) { 32d0: e8 00 00 00 00 callq 32d5 <qla2x00_get_host_fabric_name+0x5> 32d5: 55 push %rbp 32d6: 48 89 e5 mov %rsp,%rbp 32d9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) 00000000000032e0 <qla2x00_get_starget_node_name>: qla2x00_get_starget_node_name(): /root/qt/linux.git/drivers/scsi/qla2xxx/qla_attr.c:1756 fc_host_port_type(shost) = port_type; } ---- Apr 22 16:07:50 mars kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000058 Apr 22 16:07:50 mars kernel: IP: [<ffffffff813f72d7>] scsi_is_host_device+0x7/0x20 Apr 22 16:07:50 mars kernel: PGD 7fe1c8067 PUD 7f5c72067 PMD 0 Apr 22 16:07:50 mars kernel: Oops: 0000 [#1] SMP Apr 22 16:07:50 mars kernel: Modules linked in: qla2xxx scsi_transport_fc ebtable_nat ebtables ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 ... dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) [last unloaded: qla2xxx] Apr 22 16:07:50 mars kernel: CPU: 8 PID: 10452 Comm: cat Tainted: G E 4.6.0-rc4+ #2 Apr 22 16:07:50 mars kernel: Hardware name: HP ProLiant DL380 G7, BIOS P67 05/05/2011 Apr 22 16:07:50 mars kernel: task: ffff8807fcd1a880 ti: ffff8807ff128000 task.ti: ffff8807ff128000 Apr 22 16:07:50 mars kernel: RIP: 0010:[<ffffffff813f72d7>] [<ffffffff813f72d7>] scsi_is_host_device+0x7/0x20 Apr 22 16:07:50 mars kernel: RSP: 0018:ffff8807ff12bcf0 EFLAGS: 00010246 Apr 22 16:07:50 mars kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff880ffe8ade88 Apr 22 16:07:50 mars kernel: RDX: ffff8807f5db9000 RSI: ffff880ffed43340 RDI: 0000000000000000 Apr 22 16:07:50 mars kernel: RBP: ffff8807ff12bd08 R08: ffff88101f45ac38 R09: ffff8807fccef280 Apr 22 16:07:50 mars kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff88101b7e6000 Apr 22 16:07:50 mars kernel: R13: ffff8807fdaf1f00 R14: ffff8800dad379c0 R15: 0000000000000001 Apr 22 16:07:50 mars kernel: FS: 00007fd569512700(0000) GS:ffff88081fc80000(0000) knlGS:0000000000000000 Apr 22 16:07:50 mars kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Apr 22 16:07:50 mars kernel: CR2: 0000000000000058 CR3: 00000007f23f4000 CR4: 00000000000006e0 Apr 22 16:07:50 mars kernel: Stack: Apr 22 16:07:50 mars kernel: ffffffffa0759db5 ffff88101b7e6000 ffff8807f5db9000 ffff8807ff12bd10 Apr 22 16:07:50 mars kernel: ffff8807ff12bd30 ffffffffa065b1bb ffff880ffed43340 ffffffff8166d950 Apr 22 16:07:50 mars kernel: ffff8807ff12bd50 ffffffff813c9e30 ffffffff815ccfb2 ffff8800dad379c0 Apr 22 16:07:50 mars kernel: Call Trace: Apr 22 16:07:50 mars kernel: [<ffffffffa0759db5>] ? qla2x00_get_starget_node_name+0x25/0x90 [qla2xxx] Apr 22 16:07:50 mars kernel: [<ffffffffa065b1bb>] ? show_fc_host_fabric_name+0x4b/0x80 [scsi_transport_fc] Apr 22 16:07:50 mars kernel: [<ffffffff813c9e30>] ? dev_attr_show+0x20/0x50 Regards, Quinn Tran -----Original Message----- From: <linux-scsi-owner@vger.kernel.org> on behalf of Josh Poimboeuf <jpoimboe@redhat.com> Date: Tuesday, April 19, 2016 at 6:56 AM To: James Bottomley <James.Bottomley@HansenPartnership.com>, "Martin K. Petersen" <martin.petersen@oracle.com> Cc: linux-scsi <linux-scsi@vger.kernel.org>, linux-kernel <linux-kernel@vger.kernel.org>, Denys Vlasenko <dvlasenk@redhat.com>, Thomas Graf <tgraf@suug.ch>, Peter Zijlstra <peterz@infradead.org>, David Rientjes <rientjes@google.com>, Andrew Morton <akpm@linux-foundation.org>, Arnd Bergmann <arnd@arndb.de>, "jamborm@gcc.gnu.org" <jamborm@gcc.gnu.org>, Ingo Molnar <mingo@kernel.org>, Himanshu Madhani <himanshu.madhani@qlogic.com>, Dept-Eng QLA2xxx Upstream <qla2xxx-upstream@qlogic.com> Subject: [PATCH] scsi: fc: force inlining of wwn conversion functions >objtool reports [1] the following warning: > > drivers/scsi/qla2xxx/qla_attr.o: warning: objtool: qla2x00_get_host_fabric_name() falls through to next function qla2x00_get_starget_port_name() > >This warning is due to a gcc bug [2] which causes corrupt code: > > 0000000000002f53 <qla2x00_get_host_fabric_name>: > 2f53: 55 push %rbp > 2f54: 48 89 e5 mov %rsp,%rbp > > 0000000000002f57 <qla2x00_get_fc_host_stats>: > 2f57: 55 push %rbp > 2f58: b9 e8 00 00 00 mov $0xe8,%ecx > 2f5d: 48 89 e5 mov %rsp,%rbp > ... > >Note that qla2x00_get_host_fabric_name() is inexplicably truncated after >setting up the frame pointer. It falls through to the next function, >which is very bad. > >It occurs with the combination of the following two recent commits: > > bc27fb68aaad ("include/uapi/linux/byteorder, swab: force inlining of some byteswap operations") > ef3fb2422ffe ("scsi: fc: use get/put_unaligned64 for wwn access") > >The call chain which appears to trigger the problem is: > > qla2x00_get_host_fabric_name() > wwn_to_u64() > get_unaligned_be64() > be64_to_cpup() > __be64_to_cpup() > >The bug requires very specific conditions to trigger. According to Martin >Jambor (from the gcc bugzilla): > > "This bug can occur when an inlineable function containing a call to > __builtin_constant_p, which checks a parameter or a value it > references and a (possibly indirect) caller of the function actually > passes a constant, but stores it using a type of a different size." > >There's no reliable way to avoid (or even detect) the bug. Until it >gets fixed in released versions of gcc, the least intrusive workaround >for this particular issue is to force the wwn conversion functions to be >inlined. > >[1] https://lists.01.org/pipermail/kbuild-all/2016-April/019579.html >[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646 > >Reported-by: kbuild test robot <fengguang.wu@intel.com> >Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> >--- > include/scsi/scsi_transport_fc.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > >diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h >index bf66ea6..1919cd4 100644 >--- a/include/scsi/scsi_transport_fc.h >+++ b/include/scsi/scsi_transport_fc.h >@@ -796,12 +796,12 @@ fc_remote_port_chkready(struct fc_rport *rport) > return result; > } > >-static inline u64 wwn_to_u64(u8 *wwn) >+static __always_inline u64 wwn_to_u64(u8 *wwn) > { > return get_unaligned_be64(wwn); > } > >-static inline void u64_to_wwn(u64 inm, u8 *wwn) >+static __always_inline void u64_to_wwn(u64 inm, u8 *wwn) > { > put_unaligned_be64(inm, wwn); > } >-- >2.4.11 > >-- >To unsubscribe from this list: send the line "unsubscribe linux-scsi" in >the body of a message to majordomo@vger.kernel.org >More majordomo info at http://vger.kernel.org/majordomo-info.html
James, Can you merge this patch for 4.6? On Tue, Apr 19, 2016 at 08:56:00AM -0500, Josh Poimboeuf wrote: > objtool reports [1] the following warning: > > drivers/scsi/qla2xxx/qla_attr.o: warning: objtool: qla2x00_get_host_fabric_name() falls through to next function qla2x00_get_starget_port_name() > > This warning is due to a gcc bug [2] which causes corrupt code: > > 0000000000002f53 <qla2x00_get_host_fabric_name>: > 2f53: 55 push %rbp > 2f54: 48 89 e5 mov %rsp,%rbp > > 0000000000002f57 <qla2x00_get_fc_host_stats>: > 2f57: 55 push %rbp > 2f58: b9 e8 00 00 00 mov $0xe8,%ecx > 2f5d: 48 89 e5 mov %rsp,%rbp > ... > > Note that qla2x00_get_host_fabric_name() is inexplicably truncated after > setting up the frame pointer. It falls through to the next function, > which is very bad. > > It occurs with the combination of the following two recent commits: > > bc27fb68aaad ("include/uapi/linux/byteorder, swab: force inlining of some byteswap operations") > ef3fb2422ffe ("scsi: fc: use get/put_unaligned64 for wwn access") > > The call chain which appears to trigger the problem is: > > qla2x00_get_host_fabric_name() > wwn_to_u64() > get_unaligned_be64() > be64_to_cpup() > __be64_to_cpup() > > The bug requires very specific conditions to trigger. According to Martin > Jambor (from the gcc bugzilla): > > "This bug can occur when an inlineable function containing a call to > __builtin_constant_p, which checks a parameter or a value it > references and a (possibly indirect) caller of the function actually > passes a constant, but stores it using a type of a different size." > > There's no reliable way to avoid (or even detect) the bug. Until it > gets fixed in released versions of gcc, the least intrusive workaround > for this particular issue is to force the wwn conversion functions to be > inlined. > > [1] https://lists.01.org/pipermail/kbuild-all/2016-April/019579.html > [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646 > > Reported-by: kbuild test robot <fengguang.wu@intel.com> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> > --- > include/scsi/scsi_transport_fc.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h > index bf66ea6..1919cd4 100644 > --- a/include/scsi/scsi_transport_fc.h > +++ b/include/scsi/scsi_transport_fc.h > @@ -796,12 +796,12 @@ fc_remote_port_chkready(struct fc_rport *rport) > return result; > } > > -static inline u64 wwn_to_u64(u8 *wwn) > +static __always_inline u64 wwn_to_u64(u8 *wwn) > { > return get_unaligned_be64(wwn); > } > > -static inline void u64_to_wwn(u64 inm, u8 *wwn) > +static __always_inline void u64_to_wwn(u64 inm, u8 *wwn) > { > put_unaligned_be64(inm, wwn); > } > -- > 2.4.11 >
>>>>> "Josh" == Josh Poimboeuf <jpoimboe@redhat.com> writes:
Josh> Can you merge this patch for 4.6?
I am really not a big fan of working around compiler bugs in a device
driver.
Are we sure there are no other get_unaligned_be64() calls in the kernel
that suffer the same fate?
On Mon, 2016-04-25 at 22:40 -0400, Martin K. Petersen wrote: > > > > > > "Josh" == Josh Poimboeuf <jpoimboe@redhat.com> writes: > > Josh> Can you merge this patch for 4.6? > > I am really not a big fan of working around compiler bugs in a device > driver. Me neither > Are we sure there are no other get_unaligned_be64() calls in the > kernel that suffer the same fate? Agree, plus, as I've said before, we have 3-4 weeks before we go final, so we still have some time before a decision has to be made. It looks like the gcc people already have a patch for the compiler, so the distributions could just push that out through channels. James -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Monday 25 April 2016 20:37:31 James Bottomley wrote: > On Mon, 2016-04-25 at 22:40 -0400, Martin K. Petersen wrote: > > > > > > > "Josh" == Josh Poimboeuf <jpoimboe@redhat.com> writes: > > > > Josh> Can you merge this patch for 4.6? > > > > I am really not a big fan of working around compiler bugs in a device > > driver. > > Me neither > > > Are we sure there are no other get_unaligned_be64() calls in the > > kernel that suffer the same fate? > > Agree, plus, as I've said before, we have 3-4 weeks before we go final, > so we still have some time before a decision has to be made. It looks > like the gcc people already have a patch for the compiler, so the > distributions could just push that out through channels. I don't think we can realistically blacklist gcc-4.9.{0,1,2,3}, gcc-5.{0,1,2,3}.* and gcc-6.0 and require everyone to upgrade to compilers that have not been released yet in order to build a linux-4.6 kernel. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Apr 26, 2016 at 09:22:46AM +0200, Arnd Bergmann wrote: > > Agree, plus, as I've said before, we have 3-4 weeks before we go final, > > so we still have some time before a decision has to be made. It looks > > like the gcc people already have a patch for the compiler, so the > > distributions could just push that out through channels. > > I don't think we can realistically blacklist gcc-4.9.{0,1,2,3}, > gcc-5.{0,1,2,3}.* and gcc-6.0 and require everyone to upgrade to compilers > that have not been released yet in order to build a linux-4.6 kernel. Agreed. What about just removing the wrappers? They seem fairly pointless to start with. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>> "Arnd" == Arnd Bergmann <arnd@arndb.de> writes:
Arnd> I don't think we can realistically blacklist gcc-4.9.{0,1,2,3},
Arnd> gcc-5.{0,1,2,3}.* and gcc-6.0 and require everyone to upgrade to
Arnd> compilers that have not been released yet in order to build a
Arnd> linux-4.6 kernel.
I agree that compiler blacklisting is problematic and I'd like to avoid
it. The question is how far we go in the kernel to accommodate various
levels of brokenness.
In any case. Sticking compiler workarounds in device driver code is akin
to putting demolition orders on display on Alpha Centauri. At the very
minimum the patch should put a fat comment in the code stating that
these wrapper functions or #defines should not be changed in the future
because that'll break builds using gcc XYZ. But that does not solve the
problem for anybody else that might be doing something similar.
Converting between u64 and $RANDOM_TYPE in an inline wrapper does not
seem like a rare and unusual programming pattern.
On Tuesday 26 April 2016 09:06:54 Martin K. Petersen wrote: > >>>>> "Arnd" == Arnd Bergmann <arnd@arndb.de> writes: > > Arnd> I don't think we can realistically blacklist gcc-4.9.{0,1,2,3}, > Arnd> gcc-5.{0,1,2,3}.* and gcc-6.0 and require everyone to upgrade to > Arnd> compilers that have not been released yet in order to build a > Arnd> linux-4.6 kernel. > > I agree that compiler blacklisting is problematic and I'd like to avoid > it. The question is how far we go in the kernel to accommodate various > levels of brokenness. > > In any case. Sticking compiler workarounds in device driver code is akin > to putting demolition orders on display on Alpha Centauri. At the very > minimum the patch should put a fat comment in the code stating that > these wrapper functions or #defines should not be changed in the future > because that'll break builds using gcc XYZ. But that does not solve the > problem for anybody else that might be doing something similar. > Converting between u64 and $RANDOM_TYPE in an inline wrapper does not > seem like a rare and unusual programming pattern. It's not the driver really, it's the core scsi/fc layer, which makes it a little dangerous that a random driver. I agree that putting a comment in would also help. What I understand from the bug report is that to trigger this bug you need these elements: 1. an inline function marked __always_inline 2. another inline function that is automatically inlined (not __always_inline) 3. CONFIG_OPTIMIZE_INLINING=y to guarantee 2 4. __builtin_compatible_p inside that inline function The last point is what Denys introduced in the kernel with bc27fb68aaad ("include/uapi/linux/byteorder, swab: force inlining of some byteswap operations"). So maybe it's better after all to revert that patch, to have a higher confidence in the same bug not appearing elsewhere. It's also really a workaround for another quirk of the compiler, but that one only results in duplicated functions in object code rather than functions that end in the middle. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2016-04-26 at 17:58 +0200, Arnd Bergmann wrote: > On Tuesday 26 April 2016 09:06:54 Martin K. Petersen wrote: > > > > > > > "Arnd" == Arnd Bergmann <arnd@arndb.de> writes: > > > > Arnd> I don't think we can realistically blacklist gcc > > -4.9.{0,1,2,3}, > > Arnd> gcc-5.{0,1,2,3}.* and gcc-6.0 and require everyone to upgrade > > to > > Arnd> compilers that have not been released yet in order to build a > > Arnd> linux-4.6 kernel. > > > > I agree that compiler blacklisting is problematic and I'd like to > > avoid it. The question is how far we go in the kernel to > > accommodate various levels of brokenness. > > > > In any case. Sticking compiler workarounds in device driver code is > > akin to putting demolition orders on display on Alpha Centauri. At > > the very minimum the patch should put a fat comment in the code > > stating that these wrapper functions or #defines should not be > > changed in the future because that'll break builds using gcc XYZ. > > But that does not solve the problem for anybody else that might be > > doing something similar. Converting between u64 and $RANDOM_TYPE in > > an inline wrapper does not seem like a rare and unusual programming > > pattern. > > It's not the driver really, it's the core scsi/fc layer, which makes > it a little dangerous that a random driver. Well, it's libfc; that's a fibre channel transport class mostly used by FCoE drivers ... there's few enough of those to call it driver only. > I agree that putting a comment in would also help. What I understand > from the bug report is that to trigger this bug you need these > elements: > > 1. an inline function marked __always_inline > 2. another inline function that is automatically inlined (not > __always_inline) > 3. CONFIG_OPTIMIZE_INLINING=y to guarantee 2 > 4. __builtin_compatible_p inside that inline function > > The last point is what Denys introduced in the kernel with > bc27fb68aaad ("include/uapi/linux/byteorder, swab: force inlining of > some byteswap operations"). So maybe it's better after all to revert > that patch, to have a higher confidence in the same bug not appearing > elsewhere. It's also really a workaround for another quirk of the > compiler, but that one only results in duplicated functions in object > code rather than functions that end in the middle. Yes, I think this is my preferred option. That patch is nothing more than an attempt to force the compiler to do something it didn't do but should have. If we apply the general rule that we shouldn't work around compiler problems in the kernel code, then that should have been disallowed first. Plus, as the root cause of all of this, reverting that patch will ensure that nothing else picks up this problem (at least from the route we got it). James -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>> "James" == James Bottomley <James.Bottomley@HansenPartnership.com> writes: >> The last point is what Denys introduced in the kernel with >> bc27fb68aaad ("include/uapi/linux/byteorder, swab: force inlining of >> some byteswap operations"). So maybe it's better after all to revert >> that patch, to have a higher confidence in the same bug not appearing >> elsewhere. It's also really a workaround for another quirk of the >> compiler, but that one only results in duplicated functions in object >> code rather than functions that end in the middle. James> Yes, I think this is my preferred option. Same here.
Hi, On Tue, Apr 26, 2016 at 05:58:20PM +0200, Arnd Bergmann wrote: > On Tuesday 26 April 2016 09:06:54 Martin K. Petersen wrote: > > >>>>> "Arnd" == Arnd Bergmann <arnd@arndb.de> writes: > > > > Arnd> I don't think we can realistically blacklist gcc-4.9.{0,1,2,3}, > > Arnd> gcc-5.{0,1,2,3}.* and gcc-6.0 and require everyone to upgrade to > > Arnd> compilers that have not been released yet in order to build a > > Arnd> linux-4.6 kernel. > > > > I agree that compiler blacklisting is problematic and I'd like to avoid > > it. The question is how far we go in the kernel to accommodate various > > levels of brokenness. > > > > In any case. Sticking compiler workarounds in device driver code is akin > > to putting demolition orders on display on Alpha Centauri. At the very > > minimum the patch should put a fat comment in the code stating that > > these wrapper functions or #defines should not be changed in the future > > because that'll break builds using gcc XYZ. But that does not solve the > > problem for anybody else that might be doing something similar. > > Converting between u64 and $RANDOM_TYPE in an inline wrapper does not > > seem like a rare and unusual programming pattern. > > It's not the driver really, it's the core scsi/fc layer, which makes > it a little dangerous that a random driver. > > I agree that putting a comment in would also help. What I understand > from the bug report is that to trigger this bug you need these elements: > > 1. an inline function marked __always_inline > 2. another inline function that is automatically inlined (not __always_inline) > 3. CONFIG_OPTIMIZE_INLINING=y to guarantee 2 > 4. __builtin_compatible_p inside that inline function The __always_inline requirement is not true. In fact, if you look at the example testcase filed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646#c7 you'll see it uses __builtin_compatible_p in an __always inline function that is called from one that is not tagged with that attribute. And generally speaking, always inline is never a requirement, any call or chain of calls that the inliner can decide to inline can lead to the bug (if it complies with the condition below). What is a requirement, though, is that __builtin_compatible_p is called on something passed in an argument by reference or in an aggregate (i.e. struct or array) argument. So, int foo1 (unsigned long *ref) { if (__builtin_constant (*ref)) ... else /* wrongly unreachable code */ } can lead to this issue, as can int foo2 (struct S s) { if ((__builtin_constant (s.l)) ... else /* wrongly unreachable code */ } but int foo3 (unsigned long val) { if (__builtin_constant (val)) ... else /* all OK */ } cannot, and is fine. But please note that wrapping a foo[12]-like function into a dereferencing wrapper might not help if foo[12] would be early-inlined into such wrapper (GCC has two inliners, a very simple early-inliner that only handles simple cases and a full-blown IPA inliner that contains the bug). I believe this can be ensured by making the wrapper always_inline and never calling it indirectly (via a pointer). Honza (CCed), you know inlining heuristics better, please correct me if my last statement is somehow inaccurate (or indeed if you have a better idea how kernel developers can make sure they do not hit the bug). Thanks, Martin > > The last point is what Denys introduced in the kernel with > bc27fb68aaad ("include/uapi/linux/byteorder, swab: force inlining of some > byteswap operations"). So maybe it's better after all to revert that > patch, to have a higher confidence in the same bug not appearing > elsewhere. It's also really a workaround for another quirk of the > compiler, but that one only results in duplicated functions in object > code rather than functions that end in the middle. > > Arnd > -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday 27 April 2016 13:05:03 Martin Jambor wrote: > On Tue, Apr 26, 2016 at 05:58:20PM +0200, Arnd Bergmann wrote: > > On Tuesday 26 April 2016 09:06:54 Martin K. Petersen wrote: > > > >>>>> "Arnd" == Arnd Bergmann <arnd@arndb.de> writes: > > > > > > Arnd> I don't think we can realistically blacklist gcc-4.9.{0,1,2,3}, > > > Arnd> gcc-5.{0,1,2,3}.* and gcc-6.0 and require everyone to upgrade to > > > Arnd> compilers that have not been released yet in order to build a > > > Arnd> linux-4.6 kernel. > > > > > > I agree that compiler blacklisting is problematic and I'd like to avoid > > > it. The question is how far we go in the kernel to accommodate various > > > levels of brokenness. > > > > > > In any case. Sticking compiler workarounds in device driver code is akin > > > to putting demolition orders on display on Alpha Centauri. At the very > > > minimum the patch should put a fat comment in the code stating that > > > these wrapper functions or #defines should not be changed in the future > > > because that'll break builds using gcc XYZ. But that does not solve the > > > problem for anybody else that might be doing something similar. > > > Converting between u64 and $RANDOM_TYPE in an inline wrapper does not > > > seem like a rare and unusual programming pattern. > > > > It's not the driver really, it's the core scsi/fc layer, which makes > > it a little dangerous that a random driver. > > > > I agree that putting a comment in would also help. What I understand > > from the bug report is that to trigger this bug you need these elements: > > > > 1. an inline function marked __always_inline > > 2. another inline function that is automatically inlined (not __always_inline) > > 3. CONFIG_OPTIMIZE_INLINING=y to guarantee 2 > > 4. __builtin_compatible_p inside that inline function > > The __always_inline requirement is not true. In fact, if you look at > the example testcase filed in > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646#c7 you'll see it > uses __builtin_compatible_p in an __always inline function that is > called from one that is not tagged with that attribute. > > And generally speaking, always inline is never a requirement, any call > or chain of calls that the inliner can decide to inline can lead to > the bug (if it complies with the condition below). Ok, thanks for the clarification, I thought you always had to have both kinds of inline functions. > What is a requirement, though, is that __builtin_compatible_p is > called on something passed in an argument by reference or in an > aggregate (i.e. struct or array) argument. > > So, > > int foo1 (unsigned long *ref) > { > if (__builtin_constant (*ref)) > ... > else > /* wrongly unreachable code */ > } > > } > > cannot, and is fine. But please note that wrapping a foo[12]-like > function into a dereferencing wrapper might not help if foo[12] would > be early-inlined into such wrapper (GCC has two inliners, a very > simple early-inliner that only handles simple cases and a full-blown > IPA inliner that contains the bug). I believe this can be ensured by > making the wrapper always_inline and never calling it indirectly (via > a pointer). Honza (CCed), you know inlining heuristics better, please > correct me if my last statement is somehow inaccurate (or indeed if > you have a better idea how kernel developers can make sure they do not > hit the bug). I guess that means that any user of this code in the kernel: static inline __attribute_const__ __u64 __fswab64(__u64 val) { #ifdef __HAVE_BUILTIN_BSWAP64__ return __builtin_bswap64(val); #elif defined (__arch_swab64) return __arch_swab64(val); #elif defined(__SWAB_64_THRU_32__) __u32 h = val >> 32; __u32 l = val & ((1ULL << 32) - 1); return (((__u64)__fswab32(l)) << 32) | ((__u64)(__fswab32(h))); #else return ___constant_swab64(val); #endif } #define __swab64(x) \ (__builtin_constant_p((__u64)(x)) ? \ ___constant_swab64(x) : \ __fswab64(x)) static __always_inline __u64 __swab64p(const __u64 *p) { #ifdef __arch_swab64p return __arch_swab64p(p); #else return __swab64(*p); #endif } has a chance of running into the same problem, and we may want to solve it at the root. For architectures that define __HAVE_BUILTIN_BSWAP64__ (i.e. ARM, MIPS, POWERPC, S390, and x86 with gcc-4.4 or higher, 4.8 for __HAVE_BUILTIN_BSWAP16__), we can probably just change the logic to avoid __builtin_constant_p() and always use __builtin_bswap64(). This won't help on TILE, which is the one architecture that sets ARCH_SUPPORTS_OPTIMIZED_INLINING but does not set ARCH_USE_BUILTIN_BSWAP. Chris Metcalf should be able to figure out whether we can just set ARCH_USE_BUILTIN_BSWAP for tile as well. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
(Resending as text/plain) On 4/27/2016 5:34 PM, Arnd Bergmann wrote: > This won't help on TILE, which is the one architecture that sets > ARCH_SUPPORTS_OPTIMIZED_INLINING but does not set ARCH_USE_BUILTIN_BSWAP. > Chris Metcalf should be able to figure out whether we can just > set ARCH_USE_BUILTIN_BSWAP for tile as well. We certainly could enable ARCH_USE_BUILTIN_BSWAP. The only problem is that we never added explicit support for bswap16() in gcc, which is efficiently done on tilegx via the "revbytes" instruction and a 48-bit right-shift. So gcc instead does a generic thing with four instructions in three bundles, so really not as good as our asm/swab.h. I'm not sure how to weigh the implications of converting to builtin_bswap16 (and possibly upstreaming a better implementation to gcc), vs. disabling ARCH_SUPPORTS_OPTIMIZED_INLINING (which no one else but x86 uses anyway), vs. just ignoring the compiler bug and hoping it's not an issue in practice :-)
On Thursday 28 April 2016 10:58:43 Chris Metcalf wrote: > (Resending as text/plain) > > On 4/27/2016 5:34 PM, Arnd Bergmann wrote: > > This won't help on TILE, which is the one architecture that sets > > ARCH_SUPPORTS_OPTIMIZED_INLINING but does not set ARCH_USE_BUILTIN_BSWAP. > > Chris Metcalf should be able to figure out whether we can just > > set ARCH_USE_BUILTIN_BSWAP for tile as well. > > We certainly could enable ARCH_USE_BUILTIN_BSWAP. The only problem is > that we never added explicit support for bswap16() in gcc, which is > efficiently done on tilegx via the "revbytes" instruction and a 48-bit > right-shift. So gcc instead does a generic thing with four > instructions in three bundles, so really not as good as our asm/swab.h. > > I'm not sure how to weigh the implications of converting to > builtin_bswap16 (and possibly upstreaming a better implementation to > gcc), vs. disabling ARCH_SUPPORTS_OPTIMIZED_INLINING (which no one > else but x86 uses anyway), vs. just ignoring the compiler bug and > hoping it's not an issue in practice How about figuring out whether you hit the gcc bug on tile as a first step? Another idea would be to adapt this section in include/linux/compiler-gcc.h: #if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) || \ !defined(CONFIG_OPTIMIZE_INLINING) || (__GNUC__ < 4) #define inline inline __attribute__((always_inline)) notrace #define __inline__ __inline__ __attribute__((always_inline)) notrace #define __inline __inline __attribute__((always_inline)) notrace #else /* A lot of inline functions can cause havoc with function tracing */ #define inline inline notrace #define __inline__ __inline__ notrace #define __inline __inline notrace #endif to work around the issue. We already check for gcc before 4.0, and we could also check for the affected releases (4.9, 5.x, 6.1) in the same place, possibly conditional on ARCH_USE_BUILTIN_BSWAP with a comment pointing to the gcc bug tracker. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 4/28/2016 11:23 AM, Arnd Bergmann wrote: > On Thursday 28 April 2016 10:58:43 Chris Metcalf wrote: >> (Resending as text/plain) >> >> On 4/27/2016 5:34 PM, Arnd Bergmann wrote: >>> This won't help on TILE, which is the one architecture that sets >>> ARCH_SUPPORTS_OPTIMIZED_INLINING but does not set ARCH_USE_BUILTIN_BSWAP. >>> Chris Metcalf should be able to figure out whether we can just >>> set ARCH_USE_BUILTIN_BSWAP for tile as well. >> We certainly could enable ARCH_USE_BUILTIN_BSWAP. The only problem is >> that we never added explicit support for bswap16() in gcc, which is >> efficiently done on tilegx via the "revbytes" instruction and a 48-bit >> right-shift. So gcc instead does a generic thing with four >> instructions in three bundles, so really not as good as our asm/swab.h. >> >> I'm not sure how to weigh the implications of converting to >> builtin_bswap16 (and possibly upstreaming a better implementation to >> gcc), vs. disabling ARCH_SUPPORTS_OPTIMIZED_INLINING (which no one >> else but x86 uses anyway), vs. just ignoring the compiler bug and >> hoping it's not an issue in practice > How about figuring out whether you hit the gcc bug on tile as a > first step? I don't have an affected build of gcc handy (just 4.8 and 4.4). I will pass this to our compiler folks and see what they know. > Another idea would be to adapt this section in include/linux/compiler-gcc.h: > > #if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) || \ > !defined(CONFIG_OPTIMIZE_INLINING) || (__GNUC__ < 4) > #define inline inline __attribute__((always_inline)) notrace > #define __inline__ __inline__ __attribute__((always_inline)) notrace > #define __inline __inline __attribute__((always_inline)) notrace > #else > /* A lot of inline functions can cause havoc with function tracing */ > #define inline inline notrace > #define __inline__ __inline__ notrace > #define __inline __inline notrace > #endif > > to work around the issue. We already check for gcc before 4.0, and > we could also check for the affected releases (4.9, 5.x, 6.1) in the > same place, possibly conditional on ARCH_USE_BUILTIN_BSWAP with > a comment pointing to the gcc bug tracker. This does seem like a more robust solution anyway, since more instances of the bad inline pattern might get introduced in the future in other places. I wouldn't make it conditional on ARCH_USE_BUILTIN_BSWAP for the same reason.
diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h index bf66ea6..1919cd4 100644 --- a/include/scsi/scsi_transport_fc.h +++ b/include/scsi/scsi_transport_fc.h @@ -796,12 +796,12 @@ fc_remote_port_chkready(struct fc_rport *rport) return result; } -static inline u64 wwn_to_u64(u8 *wwn) +static __always_inline u64 wwn_to_u64(u8 *wwn) { return get_unaligned_be64(wwn); } -static inline void u64_to_wwn(u64 inm, u8 *wwn) +static __always_inline void u64_to_wwn(u64 inm, u8 *wwn) { put_unaligned_be64(inm, wwn); }
objtool reports [1] the following warning: drivers/scsi/qla2xxx/qla_attr.o: warning: objtool: qla2x00_get_host_fabric_name() falls through to next function qla2x00_get_starget_port_name() This warning is due to a gcc bug [2] which causes corrupt code: 0000000000002f53 <qla2x00_get_host_fabric_name>: 2f53: 55 push %rbp 2f54: 48 89 e5 mov %rsp,%rbp 0000000000002f57 <qla2x00_get_fc_host_stats>: 2f57: 55 push %rbp 2f58: b9 e8 00 00 00 mov $0xe8,%ecx 2f5d: 48 89 e5 mov %rsp,%rbp ... Note that qla2x00_get_host_fabric_name() is inexplicably truncated after setting up the frame pointer. It falls through to the next function, which is very bad. It occurs with the combination of the following two recent commits: bc27fb68aaad ("include/uapi/linux/byteorder, swab: force inlining of some byteswap operations") ef3fb2422ffe ("scsi: fc: use get/put_unaligned64 for wwn access") The call chain which appears to trigger the problem is: qla2x00_get_host_fabric_name() wwn_to_u64() get_unaligned_be64() be64_to_cpup() __be64_to_cpup() The bug requires very specific conditions to trigger. According to Martin Jambor (from the gcc bugzilla): "This bug can occur when an inlineable function containing a call to __builtin_constant_p, which checks a parameter or a value it references and a (possibly indirect) caller of the function actually passes a constant, but stores it using a type of a different size." There's no reliable way to avoid (or even detect) the bug. Until it gets fixed in released versions of gcc, the least intrusive workaround for this particular issue is to force the wwn conversion functions to be inlined. [1] https://lists.01.org/pipermail/kbuild-all/2016-April/019579.html [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646 Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> --- include/scsi/scsi_transport_fc.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)