[4/5] umh: fix processed error when UMH_WAIT_PROC is used

Message ID	20200610154923.27510-5-mcgrof@kernel.org (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=S27I=7X=vger.kernel.org=linux-nfs-owner@kernel.org> From: "Luis R. Rodriguez" <mcgrof@kernel.org> To: gregkh@linuxfoundation.org, viro@zeniv.linux.org.uk, philipp.reisner@linbit.com, lars.ellenberg@linbit.com, axboe@kernel.dk, bfields@fieldses.org, chuck.lever@oracle.com, roopa@cumulusnetworks.com, nikolay@cumulusnetworks.com, davem@davemloft.net, kuba@kernel.org, dhowells@redhat.com, jarkko.sakkinen@linux.intel.com, jmorris@namei.org, serge@hallyn.com, christian.brauner@ubuntu.com Cc: slyfox@gentoo.org, ast@kernel.org, keescook@chromium.org, josh@joshtriplett.org, ravenexp@gmail.com, chainsaw@gentoo.org, linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org, bridge@lists.linux-foundation.org, keyrings@vger.kernel.org, linux-security-module@vger.kernel.org, linux-kernel@vger.kernel.org, Luis Chamberlain <mcgrof@kernel.org>, Tiezhu Yang <yangtiezhu@loongson.cn> Subject: [PATCH 4/5] umh: fix processed error when UMH_WAIT_PROC is used Date: Wed, 10 Jun 2020 15:49:22 +0000 Message-Id: <20200610154923.27510-5-mcgrof@kernel.org> In-Reply-To: <20200610154923.27510-1-mcgrof@kernel.org> References: <20200610154923.27510-1-mcgrof@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk
Series	kmod/umh: a few fixes \| expand [0/5] kmod/umh: a few fixes [1/5] selftests: kmod: Use variable NAME in kmod_test_0001() [2/5] kmod: Remove redundant "be an" in the comment [3/5] test_kmod: Avoid potential double free in trigger_config_run_type() [4/5] umh: fix processed error when UMH_WAIT_PROC is used [5/5] selftests: simplify kmod failure value

Luis Chamberlain June 10, 2020, 3:49 p.m. UTC

From: Luis Chamberlain <mcgrof@kernel.org>

When UMH_WAIT_PROC is used we call kernel_wait4(). This is
the *only* place in the kernel where we actually inspect the
error code. Prior to this patch we returned the value from the
wait call, and that technically requires us to use wrappers
such as WEXITSTATUS(). We either fix all callers to start
using WEXITSTATUS() and friends *or* we do address this within
the umh code and let the callers get the actual error code.

The way we use kernel_wait4() on the umh is with the options
set to 0, and when this is done the wait call only waits for
terminated children. Because of this, there is no point to
complicate checks for the umh with W*() calls. That would
make the checks complex, redundant, and simply not needed.

By making the umh do the checks for us we keep users
kernel_wait4() at bay, and promote avoiding introduction of
further W*() macros and the complexities this can bring.

There were only a few callers which properly checked for
the error status using open-coded solutions. We remove
them as they are no longer needed, and also remove open
coded implicit uses of W*() uses which should never
trigger given that the options passed to wait is 0.

The only helpers we really need are for termination, so we
just include those, and we prefix our W*() helpers with K.

Since all this does is *correct* an error code, if one
was found, this change only fixes reporting the *correct*
error, and there are two places where this matters, and
which this patch fixes:

  * request_module() used to fail with an error code of
    256 when a module was not found. Now it properly
    returns 1.

  * fs/nfsd/nfs4recover.c: we never were disabling the
    upcall as the error code of -ENOENT or -EACCES was
    *never* properly checked for.

Reported-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 drivers/block/drbd/drbd_nl.c | 20 ++++++++------------
 fs/nfsd/nfs4recover.c        |  2 +-
 include/linux/sched/task.h   | 13 +++++++++++++
 kernel/umh.c                 |  4 ++--
 net/bridge/br_stp_if.c       | 10 ++--------
 security/keys/request_key.c  |  2 +-
 6 files changed, 27 insertions(+), 24 deletions(-)

Christian Borntraeger June 23, 2020, 2:11 p.m. UTC | #1

Jens Markwardt reported a regression in the linux-next runs.  with "umh: fix
processed error when UMH_WAIT_PROC is used" (from linux-next) a linux bridge
with an KVM guests no longer activates :

without patch
# ip addr show dev virbr1
6: virbr1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:1e:3f:c0 brd ff:ff:ff:ff:ff:ff
    inet 192.168.254.254/24 brd 192.168.254.255 scope global virbr1
       valid_lft forever preferred_lft forever

with this patch the bridge stays DOWN with NO-CARRIER

# ip addr show dev virbr1
6: virbr1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 52:54:00:1e:3f:c0 brd ff:ff:ff:ff:ff:ff
    inet 192.168.254.254/24 brd 192.168.254.255 scope global virbr1
       valid_lft forever preferred_lft forever

This was bisected in linux-next. Reverting from linux-next also fixes the issue.

Any idea?

Christian

Christian Borntraeger June 23, 2020, 2:23 p.m. UTC | #2

On 23.06.20 16:11, Christian Borntraeger wrote:
> Jens Markwardt reported a regression in the linux-next runs.  with "umh: fix
> processed error when UMH_WAIT_PROC is used" (from linux-next) a linux bridge
> with an KVM guests no longer activates :
> 
> without patch
> # ip addr show dev virbr1
> 6: virbr1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
>     link/ether 52:54:00:1e:3f:c0 brd ff:ff:ff:ff:ff:ff
>     inet 192.168.254.254/24 brd 192.168.254.255 scope global virbr1
>        valid_lft forever preferred_lft forever
> 
> with this patch the bridge stays DOWN with NO-CARRIER
> 
> # ip addr show dev virbr1
> 6: virbr1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
>     link/ether 52:54:00:1e:3f:c0 brd ff:ff:ff:ff:ff:ff
>     inet 192.168.254.254/24 brd 192.168.254.255 scope global virbr1
>        valid_lft forever preferred_lft forever
> 
> This was bisected in linux-next. Reverting from linux-next also fixes the issue.
> 
> Any idea?

FWIW, s390 is big endian. Maybe some of the shifts inn the __KW* macros are wrong.

Christian Borntraeger June 24, 2020, 11:11 a.m. UTC | #3

On 23.06.20 16:23, Christian Borntraeger wrote:
> 
> 
> On 23.06.20 16:11, Christian Borntraeger wrote:
>> Jens Markwardt reported a regression in the linux-next runs.  with "umh: fix
>> processed error when UMH_WAIT_PROC is used" (from linux-next) a linux bridge
>> with an KVM guests no longer activates :
>>
>> without patch
>> # ip addr show dev virbr1
>> 6: virbr1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
>>     link/ether 52:54:00:1e:3f:c0 brd ff:ff:ff:ff:ff:ff
>>     inet 192.168.254.254/24 brd 192.168.254.255 scope global virbr1
>>        valid_lft forever preferred_lft forever
>>
>> with this patch the bridge stays DOWN with NO-CARRIER
>>
>> # ip addr show dev virbr1
>> 6: virbr1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
>>     link/ether 52:54:00:1e:3f:c0 brd ff:ff:ff:ff:ff:ff
>>     inet 192.168.254.254/24 brd 192.168.254.255 scope global virbr1
>>        valid_lft forever preferred_lft forever
>>
>> This was bisected in linux-next. Reverting from linux-next also fixes the issue.
>>
>> Any idea?
> 
> FWIW, s390 is big endian. Maybe some of the shifts inn the __KW* macros are wrong.

Does anyone have an idea why "umh: fix processed error when UMH_WAIT_PROC is used" breaks the
linux-bridge on s390?

Luis Chamberlain June 24, 2020, 12:05 p.m. UTC | #4

On Wed, Jun 24, 2020 at 01:11:54PM +0200, Christian Borntraeger wrote:
> 
> 
> On 23.06.20 16:23, Christian Borntraeger wrote:
> > 
> > 
> > On 23.06.20 16:11, Christian Borntraeger wrote:
> >> Jens Markwardt reported a regression in the linux-next runs.  with "umh: fix
> >> processed error when UMH_WAIT_PROC is used" (from linux-next) a linux bridge
> >> with an KVM guests no longer activates :
> >>
> >> without patch
> >> # ip addr show dev virbr1
> >> 6: virbr1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
> >>     link/ether 52:54:00:1e:3f:c0 brd ff:ff:ff:ff:ff:ff
> >>     inet 192.168.254.254/24 brd 192.168.254.255 scope global virbr1
> >>        valid_lft forever preferred_lft forever
> >>
> >> with this patch the bridge stays DOWN with NO-CARRIER
> >>
> >> # ip addr show dev virbr1
> >> 6: virbr1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
> >>     link/ether 52:54:00:1e:3f:c0 brd ff:ff:ff:ff:ff:ff
> >>     inet 192.168.254.254/24 brd 192.168.254.255 scope global virbr1
> >>        valid_lft forever preferred_lft forever
> >>
> >> This was bisected in linux-next. Reverting from linux-next also fixes the issue.
> >>
> >> Any idea?
> > 
> > FWIW, s390 is big endian. Maybe some of the shifts inn the __KW* macros are wrong.
> 
> Does anyone have an idea why "umh: fix processed error when UMH_WAIT_PROC is used" breaks the
> linux-bridge on s390?

glibc for instance defines __WEXITSTATUS in only one location: bits/waitstatus.h
and it does not special case it per architecture, so at this point I'd
have to say we have to look somewhere else for why this is happening.

The commmit which caused this is issuing a correct error code down the
pipeline, nothing more. I'll make taking a look at this a priority right
now. Let us see what I come up with today.

  Luis

Luis Chamberlain June 24, 2020, 1:17 p.m. UTC | #5

Martin, your eyeballs would be appreciated for a bit on this.

On Wed, Jun 24, 2020 at 12:05:46PM +0000, Luis Chamberlain wrote:
> On Wed, Jun 24, 2020 at 01:11:54PM +0200, Christian Borntraeger wrote:
> > 
> > 
> > On 23.06.20 16:23, Christian Borntraeger wrote:
> > > 
> > > 
> > > On 23.06.20 16:11, Christian Borntraeger wrote:
> > >> Jens Markwardt reported a regression in the linux-next runs.  with "umh: fix
> > >> processed error when UMH_WAIT_PROC is used" (from linux-next) a linux bridge
> > >> with an KVM guests no longer activates :
> > >>
> > >> without patch
> > >> # ip addr show dev virbr1
> > >> 6: virbr1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
> > >>     link/ether 52:54:00:1e:3f:c0 brd ff:ff:ff:ff:ff:ff
> > >>     inet 192.168.254.254/24 brd 192.168.254.255 scope global virbr1
> > >>        valid_lft forever preferred_lft forever
> > >>
> > >> with this patch the bridge stays DOWN with NO-CARRIER
> > >>
> > >> # ip addr show dev virbr1
> > >> 6: virbr1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
> > >>     link/ether 52:54:00:1e:3f:c0 brd ff:ff:ff:ff:ff:ff
> > >>     inet 192.168.254.254/24 brd 192.168.254.255 scope global virbr1
> > >>        valid_lft forever preferred_lft forever
> > >>
> > >> This was bisected in linux-next. Reverting from linux-next also fixes the issue.
> > >>
> > >> Any idea?
> > > 
> > > FWIW, s390 is big endian. Maybe some of the shifts inn the __KW* macros are wrong.
> > 
> > Does anyone have an idea why "umh: fix processed error when UMH_WAIT_PROC is used" breaks the
> > linux-bridge on s390?
> 
> glibc for instance defines __WEXITSTATUS in only one location: bits/waitstatus.h
> and it does not special case it per architecture, so at this point I'd
> have to say we have to look somewhere else for why this is happening.

I found however an LTP bug indicating the need to test for
s390 wait macros [0] in light of a recent bug in glibc for s390.
I am asking for references to that issue given I cannot find
any mention of this on glibc yet.

I'm in hopes Martin might be aware of that mentioned s390 glic bug.

[0] https://github.com/linux-test-project/ltp/issues/605

  Luis

Christoph Hellwig June 24, 2020, 2:43 p.m. UTC | #6

On Wed, Jun 24, 2020 at 01:11:54PM +0200, Christian Borntraeger wrote:
> Does anyone have an idea why "umh: fix processed error when UMH_WAIT_PROC is used" breaks the
> linux-bridge on s390?

Are we even sure this is s390 specific and doesn't happen on other
architectures with the same bridge setup?

Christian Borntraeger June 24, 2020, 3:54 p.m. UTC | #7

On 24.06.20 16:43, Christoph Hellwig wrote:
> On Wed, Jun 24, 2020 at 01:11:54PM +0200, Christian Borntraeger wrote:
>> Does anyone have an idea why "umh: fix processed error when UMH_WAIT_PROC is used" breaks the
>> linux-bridge on s390?
> 
> Are we even sure this is s390 specific and doesn't happen on other
> architectures with the same bridge setup?

Fair point. AFAIK nobody has tested this yet on x86.

Luis Chamberlain June 24, 2020, 4:09 p.m. UTC | #8

On Wed, Jun 24, 2020 at 05:54:46PM +0200, Christian Borntraeger wrote:
> 
> 
> On 24.06.20 16:43, Christoph Hellwig wrote:
> > On Wed, Jun 24, 2020 at 01:11:54PM +0200, Christian Borntraeger wrote:
> >> Does anyone have an idea why "umh: fix processed error when UMH_WAIT_PROC is used" breaks the
> >> linux-bridge on s390?
> > 
> > Are we even sure this is s390 specific and doesn't happen on other
> > architectures with the same bridge setup?
> 
> Fair point. AFAIK nobody has tested this yet on x86.

Regardless, can you enable dynamic debug prints, to see if the kernel
reveals anything on the bridge code which may be relevant:

echo "file net/bridge/* +p" > /sys/kernel/debug/dynamic_debug/control

  Luis

Luis Chamberlain June 24, 2020, 4:13 p.m. UTC | #9

On Wed, Jun 24, 2020 at 01:17:25PM +0000, Luis Chamberlain wrote:
> I found however an LTP bug indicating the need to test for
> s390 wait macros [0] in light of a recent bug in glibc for s390.
> I am asking for references to that issue given I cannot find
> any mention of this on glibc yet.
> 
> [0] https://github.com/linux-test-project/ltp/issues/605

I looked into this and the bug associated was:

https://sourceware.org/bugzilla/show_bug.cgi?id=19613

The commit in question was upstream glibc commit
b49ab5f4503f36dcbf43f821f817da66b2931fe6 ("Remove union wait [BZ
#19613]"), and while I don't see anything s390 mentioned there,
the issue there was due to the caller of the wait using a long
instead of an int for the return value.

In other words, that'd not the droid we are looking for.

So the issue is something else.

  Luis

Christian Borntraeger June 24, 2020, 5:58 p.m. UTC | #10

On 24.06.20 18:09, Luis Chamberlain wrote:
> On Wed, Jun 24, 2020 at 05:54:46PM +0200, Christian Borntraeger wrote:
>>
>>
>> On 24.06.20 16:43, Christoph Hellwig wrote:
>>> On Wed, Jun 24, 2020 at 01:11:54PM +0200, Christian Borntraeger wrote:
>>>> Does anyone have an idea why "umh: fix processed error when UMH_WAIT_PROC is used" breaks the
>>>> linux-bridge on s390?
>>>
>>> Are we even sure this is s390 specific and doesn't happen on other
>>> architectures with the same bridge setup?
>>
>> Fair point. AFAIK nobody has tested this yet on x86.
> 
> Regardless, can you enable dynamic debug prints, to see if the kernel
> reveals anything on the bridge code which may be relevant:
> 
> echo "file net/bridge/* +p" > /sys/kernel/debug/dynamic_debug/control
> 
>   Luis

When I start a guest the following happens with the patch:

[   47.420237] virbr0: port 2(vnet0) entered blocking state
[   47.420242] virbr0: port 2(vnet0) entered disabled state
[   47.420315] device vnet0 entered promiscuous mode
[   47.420365] virbr0: port 2(vnet0) event 16
[   47.420366] virbr0: br_fill_info event 16 port vnet0 master virbr0
[   47.420373] virbr0: toggle option: 12 state: 0 -> 0
[   47.420536] virbr0: port 2(vnet0) entered blocking state
[   47.420538] virbr0: port 2(vnet0) event 16
[   47.420539] virbr0: br_fill_info event 16 port vnet0 master virbr0

and the nothing happens.


without the patch
[   33.805410] virbr0: hello timer expired
[   35.805413] virbr0: hello timer expired
[   36.184349] virbr0: port 2(vnet0) entered blocking state
[   36.184353] virbr0: port 2(vnet0) entered disabled state
[   36.184427] device vnet0 entered promiscuous mode
[   36.184479] virbr0: port 2(vnet0) event 16
[   36.184480] virbr0: br_fill_info event 16 port vnet0 master virbr0
[   36.184487] virbr0: toggle option: 12 state: 0 -> 0
[   36.184636] virbr0: port 2(vnet0) entered blocking state
[   36.184638] virbr0: port 2(vnet0) entered listening state
[   36.184639] virbr0: port 2(vnet0) event 16
[   36.184640] virbr0: br_fill_info event 16 port vnet0 master virbr0
[   36.184645] virbr0: port 2(vnet0) event 16
[   36.184646] virbr0: br_fill_info event 16 port vnet0 master virbr0
[   37.805478] virbr0: hello timer expired
[   38.205413] virbr0: port 2(vnet0) forward delay timer
[   38.205414] virbr0: port 2(vnet0) entered learning state
[   38.205427] virbr0: port 2(vnet0) event 16
[   38.205430] virbr0: br_fill_info event 16 port vnet0 master virbr0
[   38.765414] virbr0: port 2(vnet0) hold timer expired
[   39.805415] virbr0: hello timer expired
[   40.285410] virbr0: port 2(vnet0) forward delay timer
[   40.285411] virbr0: port 2(vnet0) entered forwarding state
[   40.285418] virbr0: topology change detected, propagating
[   40.285420] virbr0: decreasing ageing time to 400
[   40.285427] virbr0: port 2(vnet0) event 16
[   40.285432] virbr0: br_fill_info event 16 port vnet0 master virbr0
[   40.765408] virbr0: port 2(vnet0) hold timer expired
[   41.805415] virbr0: hello timer expired
[   42.765426] virbr0: port 2(vnet0) hold timer expired
[   43.805425] virbr0: hello timer expired
[   44.765426] virbr0: port 2(vnet0) hold timer expired
[   45.805418] virbr0: hello timer expired

and continuing....

Christian Borntraeger June 24, 2020, 6:09 p.m. UTC | #11

On 24.06.20 19:58, Christian Borntraeger wrote:
> 
> 
> On 24.06.20 18:09, Luis Chamberlain wrote:
>> On Wed, Jun 24, 2020 at 05:54:46PM +0200, Christian Borntraeger wrote:
>>>
>>>
>>> On 24.06.20 16:43, Christoph Hellwig wrote:
>>>> On Wed, Jun 24, 2020 at 01:11:54PM +0200, Christian Borntraeger wrote:
>>>>> Does anyone have an idea why "umh: fix processed error when UMH_WAIT_PROC is used" breaks the
>>>>> linux-bridge on s390?
>>>>
>>>> Are we even sure this is s390 specific and doesn't happen on other
>>>> architectures with the same bridge setup?
>>>
>>> Fair point. AFAIK nobody has tested this yet on x86.
>>
>> Regardless, can you enable dynamic debug prints, to see if the kernel
>> reveals anything on the bridge code which may be relevant:
>>
>> echo "file net/bridge/* +p" > /sys/kernel/debug/dynamic_debug/control
>>
>>   Luis
> 
> When I start a guest the following happens with the patch:
> 
> [   47.420237] virbr0: port 2(vnet0) entered blocking state
> [   47.420242] virbr0: port 2(vnet0) entered disabled state
> [   47.420315] device vnet0 entered promiscuous mode
> [   47.420365] virbr0: port 2(vnet0) event 16
> [   47.420366] virbr0: br_fill_info event 16 port vnet0 master virbr0
> [   47.420373] virbr0: toggle option: 12 state: 0 -> 0
> [   47.420536] virbr0: port 2(vnet0) entered blocking state
> [   47.420538] virbr0: port 2(vnet0) event 16
> [   47.420539] virbr0: br_fill_info event 16 port vnet0 master virbr0
> 
> and the nothing happens.
> 
> 
> without the patch
> [   33.805410] virbr0: hello timer expired
> [   35.805413] virbr0: hello timer expired
> [   36.184349] virbr0: port 2(vnet0) entered blocking state
> [   36.184353] virbr0: port 2(vnet0) entered disabled state
> [   36.184427] device vnet0 entered promiscuous mode
> [   36.184479] virbr0: port 2(vnet0) event 16
> [   36.184480] virbr0: br_fill_info event 16 port vnet0 master virbr0
> [   36.184487] virbr0: toggle option: 12 state: 0 -> 0
> [   36.184636] virbr0: port 2(vnet0) entered blocking state
> [   36.184638] virbr0: port 2(vnet0) entered listening state
> [   36.184639] virbr0: port 2(vnet0) event 16
> [   36.184640] virbr0: br_fill_info event 16 port vnet0 master virbr0
> [   36.184645] virbr0: port 2(vnet0) event 16
> [   36.184646] virbr0: br_fill_info event 16 port vnet0 master virbr0
> [   37.805478] virbr0: hello timer expired
> [   38.205413] virbr0: port 2(vnet0) forward delay timer
> [   38.205414] virbr0: port 2(vnet0) entered learning state
> [   38.205427] virbr0: port 2(vnet0) event 16
> [   38.205430] virbr0: br_fill_info event 16 port vnet0 master virbr0
> [   38.765414] virbr0: port 2(vnet0) hold timer expired
> [   39.805415] virbr0: hello timer expired
> [   40.285410] virbr0: port 2(vnet0) forward delay timer
> [   40.285411] virbr0: port 2(vnet0) entered forwarding state
> [   40.285418] virbr0: topology change detected, propagating
> [   40.285420] virbr0: decreasing ageing time to 400
> [   40.285427] virbr0: port 2(vnet0) event 16
> [   40.285432] virbr0: br_fill_info event 16 port vnet0 master virbr0
> [   40.765408] virbr0: port 2(vnet0) hold timer expired
> [   41.805415] virbr0: hello timer expired
> [   42.765426] virbr0: port 2(vnet0) hold timer expired
> [   43.805425] virbr0: hello timer expired
> [   44.765426] virbr0: port 2(vnet0) hold timer expired
> [   45.805418] virbr0: hello timer expired
> 
> and continuing....

Just reverting the umh.c parts like this makes the problem go away.

diff --git a/kernel/umh.c b/kernel/umh.c
index f81e8698e36e..79f139a7ca03 100644
--- a/kernel/umh.c
+++ b/kernel/umh.c
@@ -154,8 +154,8 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
                 * the real error code is already in sub_info->retval or
                 * sub_info->retval is 0 anyway, so don't mess with it then.
                 */
-               if (KWIFEXITED(ret))
-                       sub_info->retval = KWEXITSTATUS(ret);
+               if (ret)
+                       sub_info->retval = ret;
        }
 
        /* Restore default kernel sig handler */

Christian Borntraeger June 24, 2020, 6:32 p.m. UTC | #12

On 24.06.20 20:09, Christian Borntraeger wrote:
> 
> 
> On 24.06.20 19:58, Christian Borntraeger wrote:
>>
>>
>> On 24.06.20 18:09, Luis Chamberlain wrote:
>>> On Wed, Jun 24, 2020 at 05:54:46PM +0200, Christian Borntraeger wrote:
>>>>
>>>>
>>>> On 24.06.20 16:43, Christoph Hellwig wrote:
>>>>> On Wed, Jun 24, 2020 at 01:11:54PM +0200, Christian Borntraeger wrote:
>>>>>> Does anyone have an idea why "umh: fix processed error when UMH_WAIT_PROC is used" breaks the
>>>>>> linux-bridge on s390?
>>>>>
>>>>> Are we even sure this is s390 specific and doesn't happen on other
>>>>> architectures with the same bridge setup?
>>>>
>>>> Fair point. AFAIK nobody has tested this yet on x86.
>>>
>>> Regardless, can you enable dynamic debug prints, to see if the kernel
>>> reveals anything on the bridge code which may be relevant:
>>>
>>> echo "file net/bridge/* +p" > /sys/kernel/debug/dynamic_debug/control
>>>
>>>   Luis
>>
>> When I start a guest the following happens with the patch:
>>
>> [   47.420237] virbr0: port 2(vnet0) entered blocking state
>> [   47.420242] virbr0: port 2(vnet0) entered disabled state
>> [   47.420315] device vnet0 entered promiscuous mode
>> [   47.420365] virbr0: port 2(vnet0) event 16
>> [   47.420366] virbr0: br_fill_info event 16 port vnet0 master virbr0
>> [   47.420373] virbr0: toggle option: 12 state: 0 -> 0
>> [   47.420536] virbr0: port 2(vnet0) entered blocking state
>> [   47.420538] virbr0: port 2(vnet0) event 16
>> [   47.420539] virbr0: br_fill_info event 16 port vnet0 master virbr0
>>
>> and the nothing happens.
>>
>>
>> without the patch
>> [   33.805410] virbr0: hello timer expired
>> [   35.805413] virbr0: hello timer expired
>> [   36.184349] virbr0: port 2(vnet0) entered blocking state
>> [   36.184353] virbr0: port 2(vnet0) entered disabled state
>> [   36.184427] device vnet0 entered promiscuous mode
>> [   36.184479] virbr0: port 2(vnet0) event 16
>> [   36.184480] virbr0: br_fill_info event 16 port vnet0 master virbr0
>> [   36.184487] virbr0: toggle option: 12 state: 0 -> 0
>> [   36.184636] virbr0: port 2(vnet0) entered blocking state
>> [   36.184638] virbr0: port 2(vnet0) entered listening state
>> [   36.184639] virbr0: port 2(vnet0) event 16
>> [   36.184640] virbr0: br_fill_info event 16 port vnet0 master virbr0
>> [   36.184645] virbr0: port 2(vnet0) event 16
>> [   36.184646] virbr0: br_fill_info event 16 port vnet0 master virbr0
>> [   37.805478] virbr0: hello timer expired
>> [   38.205413] virbr0: port 2(vnet0) forward delay timer
>> [   38.205414] virbr0: port 2(vnet0) entered learning state
>> [   38.205427] virbr0: port 2(vnet0) event 16
>> [   38.205430] virbr0: br_fill_info event 16 port vnet0 master virbr0
>> [   38.765414] virbr0: port 2(vnet0) hold timer expired
>> [   39.805415] virbr0: hello timer expired
>> [   40.285410] virbr0: port 2(vnet0) forward delay timer
>> [   40.285411] virbr0: port 2(vnet0) entered forwarding state
>> [   40.285418] virbr0: topology change detected, propagating
>> [   40.285420] virbr0: decreasing ageing time to 400
>> [   40.285427] virbr0: port 2(vnet0) event 16
>> [   40.285432] virbr0: br_fill_info event 16 port vnet0 master virbr0
>> [   40.765408] virbr0: port 2(vnet0) hold timer expired
>> [   41.805415] virbr0: hello timer expired
>> [   42.765426] virbr0: port 2(vnet0) hold timer expired
>> [   43.805425] virbr0: hello timer expired
>> [   44.765426] virbr0: port 2(vnet0) hold timer expired
>> [   45.805418] virbr0: hello timer expired
>>
>> and continuing....
> 
> Just reverting the umh.c parts like this makes the problem go away.
> 
> diff --git a/kernel/umh.c b/kernel/umh.c
> index f81e8698e36e..79f139a7ca03 100644
> --- a/kernel/umh.c
> +++ b/kernel/umh.c
> @@ -154,8 +154,8 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
>                  * the real error code is already in sub_info->retval or
>                  * sub_info->retval is 0 anyway, so don't mess with it then.
>                  */
> -               if (KWIFEXITED(ret))
> -                       sub_info->retval = KWEXITSTATUS(ret);
> +               if (ret)
> +                       sub_info->retval = ret;
>         }
>  
>         /* Restore default kernel sig handler */

I instrumented this:

[    5.118528] ret=0x100 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x1 
[    9.409235] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   10.114914] ret=0x100 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x1 
[   10.116308] ret=0x100 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x1 
[   10.117690] ret=0x100 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x1 
[   10.118732] ret=0x100 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x1 
[   10.119899] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   10.121365] ret=0x100 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x1 
[   10.128044] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   10.945814] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   10.962799] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   10.983755] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   10.992705] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.118877] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.124359] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.129364] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.142139] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.148385] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.153686] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.158861] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.164068] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.192445] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.200605] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.208493] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.216612] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.226467] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.525363] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.532758] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.535279] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.697660] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.701634] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.818652] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.836228] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   12.082266] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   40.965898] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 

So the translations look correct. But your change is actually a sematic change
if(ret) will only trigger if there is an error
if (KWIFEXITED(ret)) will always trigger when the process ends. So we will always overwrite -ECHILD
and we did not do it before.

Christian Borntraeger June 24, 2020, 6:37 p.m. UTC | #13

On 24.06.20 20:32, Christian Borntraeger wrote:
[...]> 
> So the translations look correct. But your change is actually a sematic change
> if(ret) will only trigger if there is an error
> if (KWIFEXITED(ret)) will always trigger when the process ends. So we will always overwrite -ECHILD
> and we did not do it before. 
> 

So the right fix is

diff --git a/kernel/umh.c b/kernel/umh.c
index f81e8698e36e..a3a3196e84d1 100644
--- a/kernel/umh.c
+++ b/kernel/umh.c
@@ -154,7 +154,7 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
                 * the real error code is already in sub_info->retval or
                 * sub_info->retval is 0 anyway, so don't mess with it then.
                 */
-               if (KWIFEXITED(ret))
+               if (KWEXITSTATUS(ret))
                        sub_info->retval = KWEXITSTATUS(ret);
        }
 


I think.

Christian Borntraeger June 25, 2020, 1:26 p.m. UTC | #14

On 24.06.20 20:37, Christian Borntraeger wrote:
> 
> 
> On 24.06.20 20:32, Christian Borntraeger wrote:
> [...]> 
>> So the translations look correct. But your change is actually a sematic change
>> if(ret) will only trigger if there is an error
>> if (KWIFEXITED(ret)) will always trigger when the process ends. So we will always overwrite -ECHILD
>> and we did not do it before. 
>>
> 
> So the right fix is
> 
> diff --git a/kernel/umh.c b/kernel/umh.c
> index f81e8698e36e..a3a3196e84d1 100644
> --- a/kernel/umh.c
> +++ b/kernel/umh.c
> @@ -154,7 +154,7 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
>                  * the real error code is already in sub_info->retval or
>                  * sub_info->retval is 0 anyway, so don't mess with it then.
>                  */
> -               if (KWIFEXITED(ret))
> +               if (KWEXITSTATUS(ret))
>                         sub_info->retval = KWEXITSTATUS(ret);
>         }

Ping. Shall I send this as a proper patch or are we merging this fixup in Andrews patch queue?

Luis Chamberlain June 26, 2020, 2:54 a.m. UTC | #15

On Wed, Jun 24, 2020 at 08:37:55PM +0200, Christian Borntraeger wrote:
> 
> 
> On 24.06.20 20:32, Christian Borntraeger wrote:
> [...]> 
> > So the translations look correct. But your change is actually a sematic change
> > if(ret) will only trigger if there is an error
> > if (KWIFEXITED(ret)) will always trigger when the process ends. So we will always overwrite -ECHILD
> > and we did not do it before. 
> > 
> 
> So the right fix is
> 
> diff --git a/kernel/umh.c b/kernel/umh.c
> index f81e8698e36e..a3a3196e84d1 100644
> --- a/kernel/umh.c
> +++ b/kernel/umh.c
> @@ -154,7 +154,7 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
>                  * the real error code is already in sub_info->retval or
>                  * sub_info->retval is 0 anyway, so don't mess with it then.
>                  */
> -               if (KWIFEXITED(ret))
> +               if (KWEXITSTATUS(ret))
>                         sub_info->retval = KWEXITSTATUS(ret);
>         }
>  
> I think.

Nope, the right form is to check for WIFEXITED() before using WEXITSTATUS().
I'm not able to reproduce this on x86 with a bridge. What type of bridge
are you using on a guest, or did you mean using KVM so that the *host*
can spawn kvm guests?

It would be good if you can try to add a bridge manually and see where
things fail. Can you do something like this:

brctl addbr br0
brctl addif br0 ens6 
ip link set dev br0 up

Note that most callers are for modprobe. I'd be curious to see which
umh is failing which breaks bridge for you. Can you trut this so we can
see which umh call is failing?

diff --git a/kernel/umh.c b/kernel/umh.c
index f81e8698e36e..5ad74bc301d8 100644
--- a/kernel/umh.c
+++ b/kernel/umh.c
@@ -2,6 +2,9 @@
 /*
  * umh - the kernel usermode helper
  */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include <linux/module.h>
 #include <linux/sched.h>
 #include <linux/sched/task.h>
@@ -154,8 +157,12 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
 		 * the real error code is already in sub_info->retval or
 		 * sub_info->retval is 0 anyway, so don't mess with it then.
 		 */
-		if (KWIFEXITED(ret))
+		printk("== ret: %02x\n", ret);
+		printk("== KWIFEXITED(ret): %02x\n", KWIFEXITED(ret));
+		if (KWIFEXITED(ret)) {
+			printk("KWEXITSTATUS(ret): %d\n", KWEXITSTATUS(ret));
 			sub_info->retval = KWEXITSTATUS(ret);
+		}
 	}
 
 	/* Restore default kernel sig handler */
@@ -383,6 +390,7 @@ struct subprocess_info *call_usermodehelper_setup(const char *path, char **argv,
 		void *data)
 {
 	struct subprocess_info *sub_info;
+	unsigned int i = 0;
 	sub_info = kzalloc(sizeof(struct subprocess_info), gfp_mask);
 	if (!sub_info)
 		goto out;
@@ -394,6 +402,11 @@ struct subprocess_info *call_usermodehelper_setup(const char *path, char **argv,
 #else
 	sub_info->path = path;
 #endif
+	pr_info("sub_info->path: %s\n", sub_info->path);
+	while (argv[i])
+		printk(KERN_INFO "%s ", argv[i++]);
+	printk(KERN_INFO  "\n");
+
 	sub_info->argv = argv;
 	sub_info->envp = envp;

Christian Borntraeger June 26, 2020, 5:22 a.m. UTC | #16

On 26.06.20 04:54, Luis Chamberlain wrote:
> On Wed, Jun 24, 2020 at 08:37:55PM +0200, Christian Borntraeger wrote:
>>
>>
>> On 24.06.20 20:32, Christian Borntraeger wrote:
>> [...]> 
>>> So the translations look correct. But your change is actually a sematic change
>>> if(ret) will only trigger if there is an error
>>> if (KWIFEXITED(ret)) will always trigger when the process ends. So we will always overwrite -ECHILD
>>> and we did not do it before. 
>>>
>>
>> So the right fix is
>>
>> diff --git a/kernel/umh.c b/kernel/umh.c
>> index f81e8698e36e..a3a3196e84d1 100644
>> --- a/kernel/umh.c
>> +++ b/kernel/umh.c
>> @@ -154,7 +154,7 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
>>                  * the real error code is already in sub_info->retval or
>>                  * sub_info->retval is 0 anyway, so don't mess with it then.
>>                  */
>> -               if (KWIFEXITED(ret))
>> +               if (KWEXITSTATUS(ret))
>>                         sub_info->retval = KWEXITSTATUS(ret);
>>         }
>>  
>> I think.
> 
> Nope, the right form is to check for WIFEXITED() before using WEXITSTATUS().

But this IS a change over the previous code, no?
I will test next week as I am travelling right now.

Christoph Hellwig June 26, 2020, 9 a.m. UTC | #17

On Fri, Jun 26, 2020 at 07:22:34AM +0200, Christian Borntraeger wrote:
> 
> 
> On 26.06.20 04:54, Luis Chamberlain wrote:
> > On Wed, Jun 24, 2020 at 08:37:55PM +0200, Christian Borntraeger wrote:
> >>
> >>
> >> On 24.06.20 20:32, Christian Borntraeger wrote:
> >> [...]> 
> >>> So the translations look correct. But your change is actually a sematic change
> >>> if(ret) will only trigger if there is an error
> >>> if (KWIFEXITED(ret)) will always trigger when the process ends. So we will always overwrite -ECHILD
> >>> and we did not do it before. 
> >>>
> >>
> >> So the right fix is
> >>
> >> diff --git a/kernel/umh.c b/kernel/umh.c
> >> index f81e8698e36e..a3a3196e84d1 100644
> >> --- a/kernel/umh.c
> >> +++ b/kernel/umh.c
> >> @@ -154,7 +154,7 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
> >>                  * the real error code is already in sub_info->retval or
> >>                  * sub_info->retval is 0 anyway, so don't mess with it then.
> >>                  */
> >> -               if (KWIFEXITED(ret))
> >> +               if (KWEXITSTATUS(ret))
> >>                         sub_info->retval = KWEXITSTATUS(ret);
> >>         }
> >>  
> >> I think.
> > 
> > Nope, the right form is to check for WIFEXITED() before using WEXITSTATUS().
> 
> But this IS a change over the previous code, no?
> I will test next week as I am travelling right now. 

I'm all for reverting back to the previous behavior.  If someone wants
a behavior change it should be a separate patch.  And out of pure self
interest I'd like to see that change after my addition of the
kernel_wait helper to replace the kernel_wait4 abuse :)

Luis Chamberlain June 26, 2020, 11:40 a.m. UTC | #18

On Fri, Jun 26, 2020 at 10:00:01AM +0100, Christoph Hellwig wrote:
> On Fri, Jun 26, 2020 at 07:22:34AM +0200, Christian Borntraeger wrote:
> > 
> > 
> > On 26.06.20 04:54, Luis Chamberlain wrote:
> > > On Wed, Jun 24, 2020 at 08:37:55PM +0200, Christian Borntraeger wrote:
> > >>
> > >>
> > >> On 24.06.20 20:32, Christian Borntraeger wrote:
> > >> [...]> 
> > >>> So the translations look correct. But your change is actually a sematic change
> > >>> if(ret) will only trigger if there is an error
> > >>> if (KWIFEXITED(ret)) will always trigger when the process ends. So we will always overwrite -ECHILD
> > >>> and we did not do it before. 
> > >>>
> > >>
> > >> So the right fix is
> > >>
> > >> diff --git a/kernel/umh.c b/kernel/umh.c
> > >> index f81e8698e36e..a3a3196e84d1 100644
> > >> --- a/kernel/umh.c
> > >> +++ b/kernel/umh.c
> > >> @@ -154,7 +154,7 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
> > >>                  * the real error code is already in sub_info->retval or
> > >>                  * sub_info->retval is 0 anyway, so don't mess with it then.
> > >>                  */
> > >> -               if (KWIFEXITED(ret))
> > >> +               if (KWEXITSTATUS(ret))
> > >>                         sub_info->retval = KWEXITSTATUS(ret);
> > >>         }
> > >>  
> > >> I think.
> > > 
> > > Nope, the right form is to check for WIFEXITED() before using WEXITSTATUS().
> > 
> > But this IS a change over the previous code, no?
> > I will test next week as I am travelling right now. 
> 
> I'm all for reverting back to the previous behavior.  If someone wants
> a behavior change it should be a separate patch.  And out of pure self
> interest I'd like to see that change after my addition of the
> kernel_wait helper to replace the kernel_wait4 abuse :)

Yeah sure, this will be *slightly* cleaner after that. Today we
implicitly return -ECHLD as well under the assumption the kernel_wait4()
call failed.

Andrew, can you please revert these two for now:

selftests: simplify kmod failure value
umh: fix processed error when UMH_WAIT_PROC is used

Later, we'll add Christoph's simplier kernel wait, and make the change
directly there to catpure the right error. That still won't fix this reported
issue, but it will be cleaner and will go tested by Christian Borntraeger
before.

  Luis

Luis Chamberlain June 26, 2020, 11:50 a.m. UTC | #19

On Fri, Jun 26, 2020 at 11:40:08AM +0000, Luis Chamberlain wrote:
> Andrew, can you please revert these two for now:
> 
> selftests: simplify kmod failure value
> umh: fix processed error when UMH_WAIT_PROC is used
> 
> Later, we'll add Christoph's simplier kernel wait, and make the change
> directly there to catpure the right error. That still won't fix this reported
> issue, but it will be cleaner and will go tested by Christian Borntraeger
> before.

However, note that the only consideration to make here against this
approach of the fix later going in with the simpler kernel wait is
if this actually is fixing a real issue, and if we'd want this going to
stable.

We should for sure revert though, so Andrew please do proceed to revert
or drop those two patches alone for now.

It was unclear to me if this should go to stable given I was not sure
how severe that NFS case mentioned on the commit log was, and no one on
the NFS side has replied about that yet, however there may be other
areas where code inspection of callsites was not sufficient to find the
real critical areas.

I'm now very curious what this issue that Christian with bridge on s390
found will be.

  Luis

Luis Chamberlain June 30, 2020, 5:57 p.m. UTC | #20

On Fri, Jun 26, 2020 at 02:54:10AM +0000, Luis Chamberlain wrote:
> On Wed, Jun 24, 2020 at 08:37:55PM +0200, Christian Borntraeger wrote:
> > 
> > 
> > On 24.06.20 20:32, Christian Borntraeger wrote:
> > [...]> 
> > > So the translations look correct. But your change is actually a sematic change
> > > if(ret) will only trigger if there is an error
> > > if (KWIFEXITED(ret)) will always trigger when the process ends. So we will always overwrite -ECHILD
> > > and we did not do it before. 
> > > 
> > 
> > So the right fix is
> > 
> > diff --git a/kernel/umh.c b/kernel/umh.c
> > index f81e8698e36e..a3a3196e84d1 100644
> > --- a/kernel/umh.c
> > +++ b/kernel/umh.c
> > @@ -154,7 +154,7 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
> >                  * the real error code is already in sub_info->retval or
> >                  * sub_info->retval is 0 anyway, so don't mess with it then.
> >                  */
> > -               if (KWIFEXITED(ret))
> > +               if (KWEXITSTATUS(ret))
> >                         sub_info->retval = KWEXITSTATUS(ret);
> >         }
> >  
> > I think.
> 
> Nope, the right form is to check for WIFEXITED() before using WEXITSTATUS().
> I'm not able to reproduce this on x86 with a bridge. What type of bridge
> are you using on a guest, or did you mean using KVM so that the *host*
> can spawn kvm guests?
> 
> It would be good if you can try to add a bridge manually and see where
> things fail. Can you do something like this:
> 
> brctl addbr br0
> brctl addif br0 ens6 
> ip link set dev br0 up
> 
> Note that most callers are for modprobe. I'd be curious to see which
> umh is failing which breaks bridge for you. Can you trut this so we can
> see which umh call is failing?

Christian, any luck getting to test the code below to see what this
reveals?

  Luis

> 
> diff --git a/kernel/umh.c b/kernel/umh.c
> index f81e8698e36e..5ad74bc301d8 100644
> --- a/kernel/umh.c
> +++ b/kernel/umh.c
> @@ -2,6 +2,9 @@
>  /*
>   * umh - the kernel usermode helper
>   */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
>  #include <linux/module.h>
>  #include <linux/sched.h>
>  #include <linux/sched/task.h>
> @@ -154,8 +157,12 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
>  		 * the real error code is already in sub_info->retval or
>  		 * sub_info->retval is 0 anyway, so don't mess with it then.
>  		 */
> -		if (KWIFEXITED(ret))
> +		printk("== ret: %02x\n", ret);
> +		printk("== KWIFEXITED(ret): %02x\n", KWIFEXITED(ret));
> +		if (KWIFEXITED(ret)) {
> +			printk("KWEXITSTATUS(ret): %d\n", KWEXITSTATUS(ret));
>  			sub_info->retval = KWEXITSTATUS(ret);
> +		}
>  	}
>  
>  	/* Restore default kernel sig handler */
> @@ -383,6 +390,7 @@ struct subprocess_info *call_usermodehelper_setup(const char *path, char **argv,
>  		void *data)
>  {
>  	struct subprocess_info *sub_info;
> +	unsigned int i = 0;
>  	sub_info = kzalloc(sizeof(struct subprocess_info), gfp_mask);
>  	if (!sub_info)
>  		goto out;
> @@ -394,6 +402,11 @@ struct subprocess_info *call_usermodehelper_setup(const char *path, char **argv,
>  #else
>  	sub_info->path = path;
>  #endif
> +	pr_info("sub_info->path: %s\n", sub_info->path);
> +	while (argv[i])
> +		printk(KERN_INFO "%s ", argv[i++]);
> +	printk(KERN_INFO  "\n");
> +
>  	sub_info->argv = argv;
>  	sub_info->envp = envp;
>  
>

Christian Borntraeger July 1, 2020, 10:08 a.m. UTC | #21

On 30.06.20 19:57, Luis Chamberlain wrote:
> On Fri, Jun 26, 2020 at 02:54:10AM +0000, Luis Chamberlain wrote:
>> On Wed, Jun 24, 2020 at 08:37:55PM +0200, Christian Borntraeger wrote:
>>>
>>>
>>> On 24.06.20 20:32, Christian Borntraeger wrote:
>>> [...]> 
>>>> So the translations look correct. But your change is actually a sematic change
>>>> if(ret) will only trigger if there is an error
>>>> if (KWIFEXITED(ret)) will always trigger when the process ends. So we will always overwrite -ECHILD
>>>> and we did not do it before. 
>>>>
>>>
>>> So the right fix is
>>>
>>> diff --git a/kernel/umh.c b/kernel/umh.c
>>> index f81e8698e36e..a3a3196e84d1 100644
>>> --- a/kernel/umh.c
>>> +++ b/kernel/umh.c
>>> @@ -154,7 +154,7 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
>>>                  * the real error code is already in sub_info->retval or
>>>                  * sub_info->retval is 0 anyway, so don't mess with it then.
>>>                  */
>>> -               if (KWIFEXITED(ret))
>>> +               if (KWEXITSTATUS(ret))
>>>                         sub_info->retval = KWEXITSTATUS(ret);
>>>         }
>>>  
>>> I think.
>>
>> Nope, the right form is to check for WIFEXITED() before using WEXITSTATUS().
>> I'm not able to reproduce this on x86 with a bridge. What type of bridge
>> are you using on a guest, or did you mean using KVM so that the *host*
>> can spawn kvm guests?
>>
>> It would be good if you can try to add a bridge manually and see where
>> things fail. Can you do something like this:
>>
>> brctl addbr br0
>> brctl addif br0 ens6 
>> ip link set dev br0 up
>>
>> Note that most callers are for modprobe. I'd be curious to see which
>> umh is failing which breaks bridge for you. Can you trut this so we can
>> see which umh call is failing?
> 
> Christian, any luck getting to test the code below to see what this
> reveals?
> 
>   Luis

dmesg attached
[    0.137762] Linux version 5.8.0-rc2-next-20200623-00002-g5a5b173022ac-dirty (cborntra@m83lp52.lnxne.boe) (gcc (GCC) 9.3.1 20200317 (Red Hat 9.3.1-1), GNU ld version 2.32-31.fc31) #60 SMP Wed Jul 1 09:14:56 CEST 2020
[    0.137764] setup: Linux is running natively in 64-bit mode
[    0.137812] setup: The maximum memory size is 131072MB
[    0.137817] setup: Reserving 1024MB of memory at 130048MB for crashkernel (System RAM: 130048MB)
[    0.137901] cpu: 64 configured CPUs, 0 standby CPUs
[    0.137966] cpu: The CPU configuration topology of the machine is: 0 0 4 2 3 10 / 4
[    0.138533] Write protected kernel read-only data: 13660k
[    0.139239] Zone ranges:
[    0.139240]   DMA      [mem 0x0000000000000000-0x000000007fffffff]
[    0.139241]   Normal   [mem 0x0000000080000000-0x0000001fffffffff]
[    0.139243] Movable zone start for each node
[    0.139243] Early memory node ranges
[    0.139244]   node   0: [mem 0x0000000000000000-0x0000001fffffffff]
[    0.139251] Initmem setup node 0 [mem 0x0000000000000000-0x0000001fffffffff]
[    0.139252] On node 0 totalpages: 33554432
[    0.139253]   DMA zone: 8192 pages used for memmap
[    0.139253]   DMA zone: 0 pages reserved
[    0.139254]   DMA zone: 524288 pages, LIFO batch:63
[    0.152116]   Normal zone: 516096 pages used for memmap
[    0.152117]   Normal zone: 33030144 pages, LIFO batch:63
[    0.168380] percpu: Embedded 33 pages/cpu s97536 r8192 d29440 u135168
[    0.168388] pcpu-alloc: s97536 r8192 d29440 u135168 alloc=33*4096
[    0.168388] pcpu-alloc: [0] 000 [0] 001 [0] 002 [0] 003 
[    0.168390] pcpu-alloc: [0] 004 [0] 005 [0] 006 [0] 007 
[    0.168391] pcpu-alloc: [0] 008 [0] 009 [0] 010 [0] 011 
[    0.168392] pcpu-alloc: [0] 012 [0] 013 [0] 014 [0] 015 
[    0.168393] pcpu-alloc: [0] 016 [0] 017 [0] 018 [0] 019 
[    0.168394] pcpu-alloc: [0] 020 [0] 021 [0] 022 [0] 023 
[    0.168395] pcpu-alloc: [0] 024 [0] 025 [0] 026 [0] 027 
[    0.168396] pcpu-alloc: [0] 028 [0] 029 [0] 030 [0] 031 
[    0.168397] pcpu-alloc: [0] 032 [0] 033 [0] 034 [0] 035 
[    0.168398] pcpu-alloc: [0] 036 [0] 037 [0] 038 [0] 039 
[    0.168399] pcpu-alloc: [0] 040 [0] 041 [0] 042 [0] 043 
[    0.168400] pcpu-alloc: [0] 044 [0] 045 [0] 046 [0] 047 
[    0.168401] pcpu-alloc: [0] 048 [0] 049 [0] 050 [0] 051 
[    0.168403] pcpu-alloc: [0] 052 [0] 053 [0] 054 [0] 055 
[    0.168404] pcpu-alloc: [0] 056 [0] 057 [0] 058 [0] 059 
[    0.168405] pcpu-alloc: [0] 060 [0] 061 [0] 062 [0] 063 
[    0.168406] pcpu-alloc: [0] 064 [0] 065 [0] 066 [0] 067 
[    0.168407] pcpu-alloc: [0] 068 [0] 069 [0] 070 [0] 071 
[    0.168408] pcpu-alloc: [0] 072 [0] 073 [0] 074 [0] 075 
[    0.168409] pcpu-alloc: [0] 076 [0] 077 [0] 078 [0] 079 
[    0.168410] pcpu-alloc: [0] 080 [0] 081 [0] 082 [0] 083 
[    0.168411] pcpu-alloc: [0] 084 [0] 085 [0] 086 [0] 087 
[    0.168413] pcpu-alloc: [0] 088 [0] 089 [0] 090 [0] 091 
[    0.168414] pcpu-alloc: [0] 092 [0] 093 [0] 094 [0] 095 
[    0.168415] pcpu-alloc: [0] 096 [0] 097 [0] 098 [0] 099 
[    0.168416] pcpu-alloc: [0] 100 [0] 101 [0] 102 [0] 103 
[    0.168417] pcpu-alloc: [0] 104 [0] 105 [0] 106 [0] 107 
[    0.168418] pcpu-alloc: [0] 108 [0] 109 [0] 110 [0] 111 
[    0.168419] pcpu-alloc: [0] 112 [0] 113 [0] 114 [0] 115 
[    0.168420] pcpu-alloc: [0] 116 [0] 117 [0] 118 [0] 119 
[    0.168421] pcpu-alloc: [0] 120 [0] 121 [0] 122 [0] 123 
[    0.168422] pcpu-alloc: [0] 124 [0] 125 [0] 126 [0] 127 
[    0.168424] pcpu-alloc: [0] 128 [0] 129 [0] 130 [0] 131 
[    0.168425] pcpu-alloc: [0] 132 [0] 133 [0] 134 [0] 135 
[    0.168426] pcpu-alloc: [0] 136 [0] 137 [0] 138 [0] 139 
[    0.168427] pcpu-alloc: [0] 140 [0] 141 [0] 142 [0] 143 
[    0.168428] pcpu-alloc: [0] 144 [0] 145 [0] 146 [0] 147 
[    0.168429] pcpu-alloc: [0] 148 [0] 149 [0] 150 [0] 151 
[    0.168430] pcpu-alloc: [0] 152 [0] 153 [0] 154 [0] 155 
[    0.168431] pcpu-alloc: [0] 156 [0] 157 [0] 158 [0] 159 
[    0.168432] pcpu-alloc: [0] 160 [0] 161 [0] 162 [0] 163 
[    0.168433] pcpu-alloc: [0] 164 [0] 165 [0] 166 [0] 167 
[    0.168434] pcpu-alloc: [0] 168 [0] 169 [0] 170 [0] 171 
[    0.168435] pcpu-alloc: [0] 172 [0] 173 [0] 174 [0] 175 
[    0.168436] pcpu-alloc: [0] 176 [0] 177 [0] 178 [0] 179 
[    0.168437] pcpu-alloc: [0] 180 [0] 181 [0] 182 [0] 183 
[    0.168438] pcpu-alloc: [0] 184 [0] 185 [0] 186 [0] 187 
[    0.168439] pcpu-alloc: [0] 188 [0] 189 [0] 190 [0] 191 
[    0.168440] pcpu-alloc: [0] 192 [0] 193 [0] 194 [0] 195 
[    0.168442] pcpu-alloc: [0] 196 [0] 197 [0] 198 [0] 199 
[    0.168443] pcpu-alloc: [0] 200 [0] 201 [0] 202 [0] 203 
[    0.168444] pcpu-alloc: [0] 204 [0] 205 [0] 206 [0] 207 
[    0.168445] pcpu-alloc: [0] 208 [0] 209 [0] 210 [0] 211 
[    0.168446] pcpu-alloc: [0] 212 [0] 213 [0] 214 [0] 215 
[    0.168447] pcpu-alloc: [0] 216 [0] 217 [0] 218 [0] 219 
[    0.168448] pcpu-alloc: [0] 220 [0] 221 [0] 222 [0] 223 
[    0.168449] pcpu-alloc: [0] 224 [0] 225 [0] 226 [0] 227 
[    0.168450] pcpu-alloc: [0] 228 [0] 229 [0] 230 [0] 231 
[    0.168451] pcpu-alloc: [0] 232 [0] 233 [0] 234 [0] 235 
[    0.168452] pcpu-alloc: [0] 236 [0] 237 [0] 238 [0] 239 
[    0.168453] pcpu-alloc: [0] 240 [0] 241 [0] 242 [0] 243 
[    0.168454] pcpu-alloc: [0] 244 [0] 245 [0] 246 [0] 247 
[    0.168455] pcpu-alloc: [0] 248 [0] 249 [0] 250 [0] 251 
[    0.168456] pcpu-alloc: [0] 252 [0] 253 [0] 254 [0] 255 
[    0.168457] pcpu-alloc: [0] 256 [0] 257 [0] 258 [0] 259 
[    0.168459] pcpu-alloc: [0] 260 [0] 261 [0] 262 [0] 263 
[    0.168460] pcpu-alloc: [0] 264 [0] 265 [0] 266 [0] 267 
[    0.168461] pcpu-alloc: [0] 268 [0] 269 [0] 270 [0] 271 
[    0.168462] pcpu-alloc: [0] 272 [0] 273 [0] 274 [0] 275 
[    0.168463] pcpu-alloc: [0] 276 [0] 277 [0] 278 [0] 279 
[    0.168464] pcpu-alloc: [0] 280 [0] 281 [0] 282 [0] 283 
[    0.168465] pcpu-alloc: [0] 284 [0] 285 [0] 286 [0] 287 
[    0.168466] pcpu-alloc: [0] 288 [0] 289 [0] 290 [0] 291 
[    0.168467] pcpu-alloc: [0] 292 [0] 293 [0] 294 [0] 295 
[    0.168468] pcpu-alloc: [0] 296 [0] 297 [0] 298 [0] 299 
[    0.168469] pcpu-alloc: [0] 300 [0] 301 [0] 302 [0] 303 
[    0.168470] pcpu-alloc: [0] 304 [0] 305 [0] 306 [0] 307 
[    0.168471] pcpu-alloc: [0] 308 [0] 309 [0] 310 [0] 311 
[    0.168472] pcpu-alloc: [0] 312 [0] 313 [0] 314 [0] 315 
[    0.168473] pcpu-alloc: [0] 316 [0] 317 [0] 318 [0] 319 
[    0.168474] pcpu-alloc: [0] 320 [0] 321 [0] 322 [0] 323 
[    0.168475] pcpu-alloc: [0] 324 [0] 325 [0] 326 [0] 327 
[    0.168476] pcpu-alloc: [0] 328 [0] 329 [0] 330 [0] 331 
[    0.168477] pcpu-alloc: [0] 332 [0] 333 [0] 334 [0] 335 
[    0.168478] pcpu-alloc: [0] 336 [0] 337 [0] 338 [0] 339 
[    0.168501] Built 1 zonelists, mobility grouping on.  Total pages: 33030144
[    0.168502] Policy zone: Normal
[    0.168503] Kernel command line: root=/dev/disk/by-path/ccw-0.0.3318-part1 rd.dasd=0.0.3318 cio_ignore=all,!condev rd.znet=qeth,0.0.bd00,0.0.bd01,0.0.bd02,layer2=1,portno=0,portname=OSAPORT zfcp.allow_lun_scan=0 BOOT_IMAGE=0 crashkernel=1G brd.rd_size=8000000 brd.rd_nr=8 BOOT_IMAGE=0
[    0.169679] printk: log_buf_len individual max cpu contribution: 4096 bytes
[    0.169680] printk: log_buf_len total cpu_extra contributions: 1388544 bytes
[    0.169680] printk: log_buf_len min size: 131072 bytes
[    0.169973] printk: log_buf_len: 2097152 bytes
[    0.169974] printk: early log buf free: 123812(94%)
[    0.179095] Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes, linear)
[    0.183664] Inode-cache hash table entries: 4194304 (order: 13, 33554432 bytes, linear)
[    0.183681] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.212977] Memory: 2312420K/134217728K available (10512K kernel code, 2084K rwdata, 3148K rodata, 4008K init, 856K bss, 3358660K reserved, 0K cma-reserved)
[    0.213438] SLUB: HWalign=256, Order=0-3, MinObjects=0, CPUs=340, Nodes=1
[    0.213476] ftrace: allocating 31392 entries in 123 pages
[    0.218045] ftrace: allocated 123 pages with 6 groups
[    0.219167] rcu: Hierarchical RCU implementation.
[    0.219167] rcu: 	RCU event tracing is enabled.
[    0.219168] rcu: 	RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=340.
[    0.219169] 	Trampoline variant of Tasks RCU enabled.
[    0.219169] 	Rude variant of Tasks RCU enabled.
[    0.219169] 	Tracing variant of Tasks RCU enabled.
[    0.219170] rcu: RCU calculated value of scheduler-enlistment delay is 11 jiffies.
[    0.219171] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=340
[    0.222162] NR_IRQS: 3, nr_irqs: 3, preallocated irqs: 3
[    0.222313] clocksource: tod: mask: 0xffffffffffffffff max_cycles: 0x3b0a9be803b0a9, max_idle_ns: 1805497147909793 ns
[    0.222534] Console: colour dummy device 80x25
[    0.323253] printk: console [ttyS0] enabled
[    0.457563] Calibrating delay loop (skipped)... 21881.00 BogoMIPS preset
[    0.457564] pid_max: default: 348160 minimum: 2720
[    0.457736] LSM: Security Framework initializing
[    0.457767] SELinux:  Initializing.
[    0.458030] Mount-cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
[    0.458174] Mountpoint-cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
[    0.459174] rcu: Hierarchical SRCU implementation.
[    0.462173] smp: Bringing up secondary CPUs ...
[    0.485719] smp: Brought up 1 node, 64 CPUs
[    1.072386] random: fast init done
[    1.413328] node 0 deferred pages initialised in 930ms
[    1.442064] devtmpfs: initialized
[    1.442586] random: get_random_u32 called from bucket_table_alloc.isra.0+0x82/0x120 with crng_init=1
[    1.442863] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[    1.443545] futex hash table entries: 131072 (order: 13, 33554432 bytes, vmalloc)
[    1.447717] xor: automatically using best checksumming function   xc        
[    1.447917] NET: Registered protocol family 16
[    1.447956] audit: initializing netlink subsys (disabled)
[    1.448023] audit: type=2000 audit(1593597944.331:1): state=initialized audit_enabled=0 res=1
[    1.448147] Spectre V2 mitigation: etokens
[    1.456276] HugeTLB registered 1.00 MiB page size, pre-allocated 0 pages
[    1.702379] raid6: vx128x8  gen() 21910 MB/s
[    1.872319] raid6: vx128x8  xor() 13356 MB/s
[    1.872320] raid6: using algorithm vx128x8 gen() 21910 MB/s
[    1.872321] raid6: .... xor() 13356 MB/s, rmw enabled
[    1.872322] raid6: using s390xc recovery algorithm
[    1.872601] iommu: Default domain type: Translated 
[    1.872748] SCSI subsystem initialized
[    1.894528] PCI host bridge to bus 0000:00
[    1.894533] pci_bus 0000:00: root bus resource [bus 00]
[    1.894535] pci_bus 0000:00: root bus resource [mem 0x8000000000000000-0x8000000007ffffff 64bit pref]
[    1.894600] pci 0000:00:00.0: [1014:044b] type 00 class 0x120000
[    1.894650] pci 0000:00:00.0: reg 0x10: [mem 0xffffd80008000000-0xffffd8000fffffff 64bit pref]
[    1.894918] pci 0000:00:00.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown x0 link at 0000:00:00.0 (capable of 32.000 Gb/s with 5.0 GT/s PCIe x8 link)
[    1.894951] pci 0000:00:00.0: Adding to iommu group 0
[    1.896880] PCI host bridge to bus 0001:00
[    1.896882] pci_bus 0001:00: root bus resource [bus 00]
[    1.896883] pci_bus 0001:00: root bus resource [mem 0x8001000000000000-0x80010000000fffff 64bit pref]
[    1.896971] pci 0001:00:00.0: [15b3:1016] type 00 class 0x020000
[    1.897066] pci 0001:00:00.0: reg 0x10: [mem 0xffffd40002000000-0xffffd400020fffff 64bit pref]
[    1.897216] pci 0001:00:00.0: enabling Extended Tags
[    1.897702] pci 0001:00:00.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown x0 link at 0001:00:00.0 (capable of 63.008 Gb/s with 8.0 GT/s PCIe x8 link)
[    1.897740] pci 0001:00:00.0: Adding to iommu group 1
[    1.899688] PCI host bridge to bus 0002:00
[    1.899690] pci_bus 0002:00: root bus resource [bus 00]
[    1.899692] pci_bus 0002:00: root bus resource [mem 0x8002000000000000-0x80020000000fffff 64bit pref]
[    1.899779] pci 0002:00:00.0: [15b3:1016] type 00 class 0x020000
[    1.899877] pci 0002:00:00.0: reg 0x10: [mem 0xffffd40008000000-0xffffd400080fffff 64bit pref]
[    1.900031] pci 0002:00:00.0: enabling Extended Tags
[    1.900530] pci 0002:00:00.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown x0 link at 0002:00:00.0 (capable of 63.008 Gb/s with 8.0 GT/s PCIe x8 link)
[    1.900563] pci 0002:00:00.0: Adding to iommu group 2
[    2.424592] VFS: Disk quotas dquot_6.6.0
[    2.424648] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    2.425726] random: crng init done
[    2.426438] NET: Registered protocol family 2
[    2.427232] tcp_listen_portaddr_hash hash table entries: 65536 (order: 8, 1048576 bytes, linear)
[    2.427784] TCP established hash table entries: 524288 (order: 10, 4194304 bytes, vmalloc)
[    2.429806] TCP bind hash table entries: 65536 (order: 8, 1048576 bytes, linear)
[    2.430265] TCP: Hash tables configured (established 524288 bind 65536)
[    2.430641] UDP hash table entries: 65536 (order: 9, 2097152 bytes, vmalloc)
[    2.431636] UDP-Lite hash table entries: 65536 (order: 9, 2097152 bytes, vmalloc)
[    2.433139] NET: Registered protocol family 1
[    2.433180] Trying to unpack rootfs image as initramfs...
[    3.042521] Freeing initrd memory: 46272K
[    3.042920] alg: No test for crc32be (crc32be-vx)
[    3.045455] Initialise system trusted keyrings
[    3.045514] workingset: timestamp_bits=45 max_order=25 bucket_order=0
[    3.046303] zbud: loaded
[    3.046685] fuse: init (API version 7.31)
[    3.046739] SGI XFS with ACLs, security attributes, realtime, quota, no debug enabled
[    3.049940] Key type asymmetric registered
[    3.049942] Asymmetric key parser 'x509' registered
[    3.049948] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 251)
[    3.050150] io scheduler mq-deadline registered
[    3.050152] io scheduler kyber registered
[    3.050171] io scheduler bfq registered
[    3.051187] atomic64_test: passed
[    3.051236] hvc_iucv: The z/VM IUCV HVC device driver cannot be used without z/VM
[    3.055982] brd: module loaded
[    3.056482] NET: Registered protocol family 10
[    3.057354] Segment Routing with IPv6
[    3.057366] NET: Registered protocol family 17
[    3.057370] Key type dns_resolver registered
[    3.057524] cio: Channel measurement facility initialized using format extended (mode autodetected)
[    3.057802] Discipline DIAG cannot be used without z/VM
[    5.446198] sclp_sd: No data is available for the config data entity
[    5.954014] qeth: loading core functions
[    5.954063] qeth: register layer 2 discipline
[    5.954064] qeth: register layer 3 discipline
[    5.954126] registered taskstats version 1
[    5.954131] Loading compiled-in X.509 certificates
[    5.993277] Loaded X.509 cert 'Build time autogenerated kernel key: c46ba92ee388c82c5891ee836c9c20b752cdfac5'
[    5.995428] zswap: loaded using pool lzo/zbud
[    5.996258] Key type ._fscrypt registered
[    5.996258] Key type .fscrypt registered
[    5.996259] Key type fscrypt-provisioning registered
[    5.996570] Btrfs loaded, crc32c=crc32c-vx
[    5.996583] ima: No TPM chip found, activating TPM-bypass!
[    5.996586] ima: Allocated hash algorithm: sha256
[    5.996596] ima: No architecture policies found
[    5.997561] Freeing unused kernel memory: 4008K
[    6.042376] Write protected read-only-after-init data: 80k
[    6.042381] Run /init as init process
[    6.042383]   with arguments:
[    6.042383]     /init
[    6.042383]   with environment:
[    6.042384]     HOME=/
[    6.042384]     TERM=linux
[    6.042384]     BOOT_IMAGE=0
[    6.042385]     crashkernel=1G
[    6.057157] systemd[1]: Inserted module 'autofs4'
[    6.058276] systemd[1]: systemd v243.8-1.fc31 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=unified)
[    6.059083] systemd[1]: Detected architecture s390x.
[    6.059085] systemd[1]: Running in initial RAM disk.
[    6.059147] systemd[1]: Set hostname to <m83lp52.lnxne.boe>.
[    6.098479] systemd[1]: Reached target Local File Systems.
[    6.098536] systemd[1]: Reached target Slices.
[    6.098555] systemd[1]: Reached target Swap.
[    6.098570] systemd[1]: Reached target Timers.
[    6.098663] systemd[1]: Listening on Journal Audit Socket.
[    6.098712] systemd[1]: Listening on Journal Socket (/dev/log).
[    6.145682] umh: sub_info->path: /sbin/modprobe
[    6.145683] /sbin/modprobe 
[    6.145684] -q 
[    6.145685] -- 
[    6.145685] sch_fq_codel 

[    6.146408] == ret: 100
[    6.146410] == KWIFEXITED(ret): 01
[    6.146411] KWEXITSTATUS(ret): 1
[    6.365672] audit: type=1130 audit(1593597949.251:2): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-journald comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[    6.372997] audit: type=1130 audit(1593597949.261:3): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-tmpfiles-setup comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[    7.294273] audit: type=1130 audit(1593597950.181:4): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=dracut-cmdline comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[    7.310777] audit: type=1130 audit(1593597950.191:5): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=dracut-pre-udev comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[    7.311257] audit: type=1334 audit(1593597950.191:6): prog-id=6 op=LOAD
[    7.311287] audit: type=1334 audit(1593597950.191:7): prog-id=7 op=LOAD
[    7.533584] audit: type=1130 audit(1593597950.421:8): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-udevd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[    7.538462] qeth 0.0.bd00: portname is deprecated and is ignored
[    7.539958] dasd-eckd 0.0.3318: A channel path to the device has become operational
[    7.540242] dasd-eckd 0.0.3319: A channel path to the device has become operational
[    7.540743] dasd-eckd 0.0.331a: A channel path to the device has become operational
[    7.540985] dasd-eckd 0.0.331b: A channel path to the device has become operational
[    7.543658] qdio: 0.0.bd02 OSA on SC 159b using AI:1 QEBSM:0 PRI:1 TDD:1 SIGA: W AP
[    7.550125] dasd-eckd 0.0.3319: New DASD 3390/0E (CU 3990/01) with 262668 cylinders, 15 heads, 224 sectors
[    7.552945] dasd-eckd 0.0.3319: DASD with 4 KB/block, 189120960 KB total size, 48 KB/track, compatible disk layout
[    7.554058]  dasdb:VOL1/  0X3319: dasdb1
[    7.554603] dasd-eckd 0.0.3318: New DASD 3390/0E (CU 3990/01) with 262668 cylinders, 15 heads, 224 sectors
[    7.557378] dasd-eckd 0.0.3318: DASD with 4 KB/block, 189120960 KB total size, 48 KB/track, compatible disk layout
[    7.558349]  dasda:VOL1/  0X3318: dasda1
[    7.558949] dasd-eckd 0.0.331a: New DASD 3390/0C (CU 3990/01) with 30051 cylinders, 15 heads, 224 sectors
[    7.561612] dasd-eckd 0.0.331a: DASD with 4 KB/block, 21636720 KB total size, 48 KB/track, compatible disk layout
[    7.563390] dasd-eckd 0.0.331b: New DASD 3390/0C (CU 3990/01) with 30051 cylinders, 15 heads, 224 sectors
[    7.563427]  dasdc:VOL1/  0X3333: dasdc1 dasdc2 dasdc3
[    7.566603] dasd-eckd 0.0.331b: DASD with 4 KB/block, 21636720 KB total size, 48 KB/track, compatible disk layout
[    7.567763]  dasdd:VOL1/  0X3333: dasdd1 dasdd2 dasdd3
[    7.575390] qeth 0.0.bd00: QDIO data connection isolation is deactivated
[    7.575863] qeth 0.0.bd00: The device represents a Bridge Capable Port
[    7.576279] audit: type=1130 audit(1593597950.461:9): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-udev-trigger comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[    7.579230] qeth 0.0.bd00: MAC address b6:c4:0f:21:ab:4e successfully registered
[    7.579749] qeth 0.0.bd00: Device is a OSD Express card (level: 0202)
               with link type OSD_10GIG.
[    7.590981] audit: type=1130 audit(1593597950.471:10): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=plymouth-start comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[    7.621333] qeth 0.0.bd00: MAC address de:45:d7:61:c4:13 successfully registered
[    7.623582] qeth 0.0.bd00 encbd00: renamed from eth0
[    7.812522] mlx5_core 0001:00:00.0: enabling device (0000 -> 0002)
[    7.812608] mlx5_core 0001:00:00.0: firmware version: 14.23.1020
[    8.035163] dasdconf.sh Warning: 0.0.3319 is already online, not configuring
[    8.037162] dasdconf.sh Warning: 0.0.331b is already online, not configuring
[    8.037885] dasdconf.sh Warning: 0.0.3318 is already online, not configuring
[    8.065688] dasdconf.sh Warning: 0.0.331a is already online, not configuring
[    8.254855] umh: sub_info->path: /sbin/modprobe
[    8.254857] /sbin/modprobe 
[    8.254858] -q 
[    8.254858] -- 
[    8.254859] mlx5_ib 

[    8.255376] mlx5_core 0002:00:00.0: enabling device (0000 -> 0002)
[    8.255460] mlx5_core 0002:00:00.0: firmware version: 14.23.1020
[    8.706511] umh: sub_info->path: /sbin/modprobe
[    8.706542] /sbin/modprobe 
[    8.706546] -q 
[    8.706555] -- 
[    8.706563] mlx5_ib 

[    8.708963] mlx5_core 0001:00:00.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[    8.853364] mlx5_core 0002:00:00.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[    9.023031] mlx5_core 0002:00:00.0 enP2s564np0: renamed from eth1
[    9.223067] mlx5_core 0001:00:00.0 enP1s519np0: renamed from eth0
[    9.553829] audit: type=1130 audit(1593597952.441:11): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=dracut-initqueue comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[    9.570174] audit: type=1130 audit(1593597952.451:12): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-fsck-root comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[    9.578552] EXT4-fs (dasda1): mounted filesystem with ordered data mode. Opts: (null)
[    9.586599] audit: type=1334 audit(1593597952.471:13): prog-id=5 op=UNLOAD
[    9.937200] systemd-journald[484]: Received SIGTERM from PID 1 (systemd).
[    9.954691] printk: systemd: 19 output lines suppressed due to ratelimiting
[   10.278153] SELinux:  Permission watch in class filesystem not defined in policy.
[   10.278160] SELinux:  Permission watch in class file not defined in policy.
[   10.278161] SELinux:  Permission watch_mount in class file not defined in policy.
[   10.278162] SELinux:  Permission watch_sb in class file not defined in policy.
[   10.278163] SELinux:  Permission watch_with_perm in class file not defined in policy.
[   10.278164] SELinux:  Permission watch_reads in class file not defined in policy.
[   10.278167] SELinux:  Permission watch in class dir not defined in policy.
[   10.278168] SELinux:  Permission watch_mount in class dir not defined in policy.
[   10.278169] SELinux:  Permission watch_sb in class dir not defined in policy.
[   10.278170] SELinux:  Permission watch_with_perm in class dir not defined in policy.
[   10.278171] SELinux:  Permission watch_reads in class dir not defined in policy.
[   10.278174] SELinux:  Permission watch in class lnk_file not defined in policy.
[   10.278175] SELinux:  Permission watch_mount in class lnk_file not defined in policy.
[   10.278176] SELinux:  Permission watch_sb in class lnk_file not defined in policy.
[   10.278177] SELinux:  Permission watch_with_perm in class lnk_file not defined in policy.
[   10.278178] SELinux:  Permission watch_reads in class lnk_file not defined in policy.
[   10.278180] SELinux:  Permission watch in class chr_file not defined in policy.
[   10.278198] SELinux:  Permission watch_mount in class chr_file not defined in policy.
[   10.278199] SELinux:  Permission watch_sb in class chr_file not defined in policy.
[   10.278200] SELinux:  Permission watch_with_perm in class chr_file not defined in policy.
[   10.278201] SELinux:  Permission watch_reads in class chr_file not defined in policy.
[   10.278203] SELinux:  Permission watch in class blk_file not defined in policy.
[   10.278204] SELinux:  Permission watch_mount in class blk_file not defined in policy.
[   10.278205] SELinux:  Permission watch_sb in class blk_file not defined in policy.
[   10.278206] SELinux:  Permission watch_with_perm in class blk_file not defined in policy.
[   10.278207] SELinux:  Permission watch_reads in class blk_file not defined in policy.
[   10.278209] SELinux:  Permission watch in class sock_file not defined in policy.
[   10.278211] SELinux:  Permission watch_mount in class sock_file not defined in policy.
[   10.278212] SELinux:  Permission watch_sb in class sock_file not defined in policy.
[   10.278212] SELinux:  Permission watch_with_perm in class sock_file not defined in policy.
[   10.278213] SELinux:  Permission watch_reads in class sock_file not defined in policy.
[   10.278216] SELinux:  Permission watch in class fifo_file not defined in policy.
[   10.278217] SELinux:  Permission watch_mount in class fifo_file not defined in policy.
[   10.278218] SELinux:  Permission watch_sb in class fifo_file not defined in policy.
[   10.278219] SELinux:  Permission watch_with_perm in class fifo_file not defined in policy.
[   10.278220] SELinux:  Permission watch_reads in class fifo_file not defined in policy.
[   10.278262] SELinux:  Permission perfmon in class capability2 not defined in policy.
[   10.278263] SELinux:  Permission bpf in class capability2 not defined in policy.
[   10.278268] SELinux:  Permission perfmon in class cap2_userns not defined in policy.
[   10.278269] SELinux:  Permission bpf in class cap2_userns not defined in policy.
[   10.278304] SELinux:  Class perf_event not defined in policy.
[   10.278305] SELinux:  Class lockdown not defined in policy.
[   10.278306] SELinux: the above unknown classes and permissions will be allowed
[   10.278311] SELinux:  policy capability network_peer_controls=1
[   10.278311] SELinux:  policy capability open_perms=1
[   10.278312] SELinux:  policy capability extended_socket_class=1
[   10.278313] SELinux:  policy capability always_check_network=0
[   10.278314] SELinux:  policy capability cgroup_seclabel=1
[   10.278314] SELinux:  policy capability nnp_nosuid_transition=1
[   10.278315] SELinux:  policy capability genfs_seclabel_symlinks=0
[   10.389095] systemd[1]: Successfully loaded SELinux policy in 339.175ms.
[   10.534855] systemd[1]: Relabelled /dev, /dev/shm, /run, /sys/fs/cgroup in 13.169ms.
[   10.537104] systemd[1]: systemd v243.8-1.fc31 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=unified)
[   10.537930] systemd[1]: Detected architecture s390x.
[   10.540018] systemd[1]: Set hostname to <m83lp52.lnxne.boe>.
[   10.716298] systemd[1]: /usr/lib/systemd/system/sssd.service:12: PIDFile= references a path below legacy directory /var/run/, updating /var/run/sssd.pid → /run/sssd.pid; please update the unit file accordingly.
[   10.721240] systemd[1]: /usr/lib/systemd/system/iscsid.service:11: PIDFile= references a path below legacy directory /var/run/, updating /var/run/iscsid.pid → /run/iscsid.pid; please update the unit file accordingly.
[   10.721443] systemd[1]: /usr/lib/systemd/system/iscsiuio.service:13: PIDFile= references a path below legacy directory /var/run/, updating /var/run/iscsiuio.pid → /run/iscsiuio.pid; please update the unit file accordingly.
[   10.748479] systemd[1]: /usr/lib/systemd/system/sssd-kcm.socket:7: ListenStream= references a path below legacy directory /var/run/, updating /var/run/.heim_org.h5l.kcm-socket → /run/.heim_org.h5l.kcm-socket; please update the unit file accordingly.
[   10.781859] systemd[1]: initrd-switch-root.service: Succeeded.
[   10.781985] systemd[1]: Stopped Switch Root.
[   10.782251] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 1.
[   10.810779] EXT4-fs (dasda1): re-mounted. Opts: (null)
[   10.841271] umh: sub_info->path: /sbin/modprobe
[   10.841275] /sbin/modprobe 
[   10.841275] -q 
[   10.841276] -- 
[   10.841277] sch_fq_codel 

[   10.889454] == ret: 00
[   10.889456] == KWIFEXITED(ret): 01
[   10.889457] KWEXITSTATUS(ret): 0
[   11.118962] systemd-journald[990]: Received client request to flush runtime journal.
[   11.392641] VFIO - User Level meta-driver version: 0.3
[   11.490782] genwqe 0000:00:00.0: enabling device (0000 -> 0002)
[   11.506302] dasdconf.sh Warning: 0.0.3319 is already online, not configuring
[   11.507431] dasdconf.sh Warning: 0.0.331a is already online, not configuring
[   11.507992] dasdconf.sh Warning: 0.0.331b is already online, not configuring
[   11.508762] dasdconf.sh Warning: 0.0.3318 is already online, not configuring
[   11.625686] umh: sub_info->path: /sbin/modprobe
[   11.625690] /sbin/modprobe 
[   11.625690] -q 
[   11.625691] -- 
[   11.625692] crypto-ecb(aes) 

[   11.627270] == ret: 100
[   11.627272] == KWIFEXITED(ret): 01
[   11.627274] KWEXITSTATUS(ret): 1
[   11.627678] umh: sub_info->path: /sbin/modprobe
[   11.627681] /sbin/modprobe 
[   11.627682] -q 
[   11.627682] -- 
[   11.627683] crypto-cbc(aes) 

[   11.628695] == ret: 100
[   11.628699] == KWIFEXITED(ret): 01
[   11.628699] KWEXITSTATUS(ret): 1
[   11.629108] umh: sub_info->path: /sbin/modprobe
[   11.629110] /sbin/modprobe 
[   11.629110] -q 
[   11.629111] -- 
[   11.629112] crypto-xts(aes) 

[   11.630032] == ret: 100
[   11.630034] == KWIFEXITED(ret): 01
[   11.630035] KWEXITSTATUS(ret): 1
[   11.630098] umh: sub_info->path: /sbin/modprobe
[   11.630100] /sbin/modprobe 
[   11.630100] -q 
[   11.630101] -- 
[   11.630102] crypto-aes 

[   11.630979] == ret: 100
[   11.630980] == KWIFEXITED(ret): 01
[   11.631000] KWEXITSTATUS(ret): 1
[   11.631010] umh: sub_info->path: /sbin/modprobe
[   11.631012] /sbin/modprobe 
[   11.631012] -q 
[   11.631012] -- 
[   11.631013] cryptomgr 

[   11.631844] == ret: 00
[   11.631845] == KWIFEXITED(ret): 01
[   11.631846] KWEXITSTATUS(ret): 0
[   11.632291] umh: sub_info->path: /sbin/modprobe
[   11.632293] /sbin/modprobe 
[   11.632293] -q 
[   11.632294] -- 
[   11.632295] crypto-ctr(aes) 

[   11.633134] == ret: 100
[   11.633136] == KWIFEXITED(ret): 01
[   11.633136] KWEXITSTATUS(ret): 1
[   11.633211] umh: sub_info->path: /sbin/modprobe
[   11.633213] /sbin/modprobe 
[   11.633213] -q 
[   11.633214] -- 
[   11.633215] crypto-ctr 

[   11.639862] == ret: 00
[   11.639863] == KWIFEXITED(ret): 01
[   11.639864] KWEXITSTATUS(ret): 0
[   12.291736] device-mapper: uevent: version 1.0.3
[   12.291869] device-mapper: ioctl: 4.42.0-ioctl (2020-02-27) initialised: dm-devel@redhat.com
[   12.811519] kauditd_printk_skb: 81 callbacks suppressed
[   12.811520] audit: type=1130 audit(1593597955.691:93): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=dmraid-activation comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   12.811526] audit: type=1131 audit(1593597955.691:94): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=dmraid-activation comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   12.827013] audit: type=1130 audit(1593597955.711:95): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-fsck@dev-disk-by\x2dpath-ccw\x2d0.0.3319\x2dpart1 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   12.832192] XFS (dasdb1): Mounting V5 Filesystem
[   12.842643] RPC: Registered named UNIX socket transport module.
[   12.842645] RPC: Registered udp transport module.
[   12.842646] RPC: Registered tcp transport module.
[   12.842647] RPC: Registered tcp NFSv4.1 backchannel transport module.
[   12.855633] XFS (dasdb1): Ending clean mount
[   12.857782] xfs filesystem being mounted at /home supports timestamps until 2038 (0x7fffffff)
[   12.876314] audit: type=1130 audit(1593597955.761:96): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=dracut-shutdown comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   12.896111] audit: type=1130 audit(1593597955.781:97): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=plymouth-read-write comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   12.896302] audit: type=1130 audit(1593597955.781:98): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=import-state comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   12.899463] RPC: Registered rdma transport module.
[   12.899466] RPC: Registered rdma backchannel transport module.
[   12.900337] audit: type=1130 audit(1593597955.781:99): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=rdma-load-modules@rdma comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   12.916905] audit: type=1130 audit(1593597955.801:100): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=rdma comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   12.952843] audit: type=1130 audit(1593597955.841:101): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-tmpfiles-setup comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   12.962072] audit: type=1305 audit(1593597955.841:102): op=set audit_enabled=1 old=1 auid=4294967295 ses=4294967295 subj=system_u:system_r:auditd_t:s0 res=1
[   13.444427] umh: sub_info->path: /sbin/modprobe
[   13.444430] /sbin/modprobe 
[   13.444430] -q 
[   13.444431] -- 
[   13.444432] iptable_filter 

[   13.485947] == ret: 00
[   13.485949] == KWIFEXITED(ret): 01
[   13.485950] KWEXITSTATUS(ret): 0
[   13.525155] umh: sub_info->path: /sbin/modprobe
[   13.525156] /sbin/modprobe 
[   13.525157] -q 
[   13.525157] -- 
[   13.525158] ip6table_filter 

[   13.575540] == ret: 00
[   13.575542] == KWIFEXITED(ret): 01
[   13.575543] KWEXITSTATUS(ret): 0
[   13.590423] umh: sub_info->path: /sbin/modprobe
[   13.590425] /sbin/modprobe 
[   13.590425] -q 
[   13.590426] -- 
[   13.590427] net-pf-16-proto-12 

[   13.617106] == ret: 00
[   13.617108] == KWIFEXITED(ret): 01
[   13.617109] KWEXITSTATUS(ret): 0
[   13.617147] umh: sub_info->path: /sbin/modprobe
[   13.617148] /sbin/modprobe 
[   13.617149] -q 
[   13.617149] -- 
[   13.617150] nfnetlink-subsys-6 

[   13.638728] == ret: 00
[   13.638729] == KWIFEXITED(ret): 01
[   13.638730] KWEXITSTATUS(ret): 0
[   13.831375] umh: sub_info->path: /sbin/modprobe
[   13.831377] /sbin/modprobe 
[   13.831378] -q 
[   13.831379] -- 
[   13.831379] iptable_security 

[   13.846372] == ret: 00
[   13.846374] == KWIFEXITED(ret): 01
[   13.846375] KWEXITSTATUS(ret): 0
[   13.847932] umh: sub_info->path: /sbin/modprobe
[   13.847934] /sbin/modprobe 
[   13.847934] -q 
[   13.847935] -- 
[   13.847935] iptable_raw 

[   13.854549] == ret: 00
[   13.854550] == KWIFEXITED(ret): 01
[   13.854551] KWEXITSTATUS(ret): 0
[   13.855856] umh: sub_info->path: /sbin/modprobe
[   13.855857] /sbin/modprobe 
[   13.855857] -q 
[   13.855858] -- 
[   13.855858] iptable_mangle 

[   13.860364] == ret: 00
[   13.860365] == KWIFEXITED(ret): 01
[   13.860366] KWEXITSTATUS(ret): 0
[   13.861761] umh: sub_info->path: /sbin/modprobe
[   13.861762] /sbin/modprobe 
[   13.861763] -q 
[   13.861763] -- 
[   13.861764] iptable_nat 

[   13.911825] == ret: 00
[   13.911826] == KWIFEXITED(ret): 01
[   13.911827] KWEXITSTATUS(ret): 0
[   13.914686] umh: sub_info->path: /sbin/modprobe
[   13.914688] /sbin/modprobe 
[   13.914688] -q 
[   13.914689] -- 
[   13.914689] ip6table_security 

[   13.945628] == ret: 00
[   13.945630] == KWIFEXITED(ret): 01
[   13.945630] KWEXITSTATUS(ret): 0
[   13.947019] umh: sub_info->path: /sbin/modprobe
[   13.947020] /sbin/modprobe 
[   13.947020] -q 
[   13.947021] -- 
[   13.947022] ip6table_raw 

[   13.969857] == ret: 00
[   13.969859] == KWIFEXITED(ret): 01
[   13.969860] KWEXITSTATUS(ret): 0
[   13.971549] umh: sub_info->path: /sbin/modprobe
[   13.971551] /sbin/modprobe 
[   13.971552] -q 
[   13.971553] -- 
[   13.971553] ip6table_mangle 

[   13.977724] == ret: 00
[   13.977725] == KWIFEXITED(ret): 01
[   13.977726] KWEXITSTATUS(ret): 0
[   13.979392] umh: sub_info->path: /sbin/modprobe
[   13.979393] /sbin/modprobe 
[   13.979394] -q 
[   13.979394] -- 
[   13.979395] ip6table_nat 

[   13.985103] == ret: 00
[   13.985104] == KWIFEXITED(ret): 01
[   13.985105] KWEXITSTATUS(ret): 0
[   14.014760] umh: sub_info->path: /sbin/modprobe
[   14.014762] /sbin/modprobe 
[   14.014762] -q 
[   14.014762] -- 
[   14.014763] ipt_conntrack 

[   14.039306] mlx5_core 0001:00:00.0 enP1s519np0: Link up
[   14.041835] IPv6: ADDRCONF(NETDEV_CHANGE): enP1s519np0: link becomes ready
[   14.047723] == ret: 00
[   14.047725] == KWIFEXITED(ret): 01
[   14.047726] KWEXITSTATUS(ret): 0
[   14.048012] umh: sub_info->path: /sbin/modprobe
[   14.048014] /sbin/modprobe 
[   14.048014] -q 
[   14.048015] -- 
[   14.048015] ipt_REJECT 

[   14.073846] == ret: 00
[   14.073848] == KWIFEXITED(ret): 01
[   14.073848] KWEXITSTATUS(ret): 0
[   14.077239] umh: sub_info->path: /sbin/modprobe
[   14.077241] /sbin/modprobe 
[   14.077241] -q 
[   14.077242] -- 
[   14.077242] ip6t_rpfilter 

[   14.099370] == ret: 00
[   14.099372] == KWIFEXITED(ret): 01
[   14.099373] KWEXITSTATUS(ret): 0
[   14.100300] umh: sub_info->path: /sbin/modprobe
[   14.100301] /sbin/modprobe 
[   14.100302] -q 
[   14.100302] -- 
[   14.100303] ip6t_REJECT 

[   14.122176] == ret: 00
[   14.122177] == KWIFEXITED(ret): 01
[   14.122178] KWEXITSTATUS(ret): 0
[   14.128047] umh: sub_info->path: /sbin/modprobe
[   14.128048] /sbin/modprobe 
[   14.128048] -q 
[   14.128049] -- 
[   14.128049] ip6t_tcp 

[   14.149302] mlx5_core 0002:00:00.0 enP2s564np0: Link up
[   14.173411] == ret: 00
[   14.173413] == KWIFEXITED(ret): 01
[   14.173414] KWEXITSTATUS(ret): 0
[   14.322406] umh: sub_info->path: /sbin/modprobe
[   14.322409] /sbin/modprobe 
[   14.322410] -q 
[   14.322410] -- 
[   14.322411] rtnl-link-bridge 

[   14.389015] == ret: 00
[   14.389017] == KWIFEXITED(ret): 01
[   14.389018] KWEXITSTATUS(ret): 0
[   14.389335] umh: sub_info->path: /sbin/modprobe
[   14.389336] /sbin/modprobe 
[   14.389336] -q 
[   14.389337] -- 
[   14.389338] char-major-10-200 

[   14.437420] tun: Universal TUN/TAP device driver, 1.6
[   14.437604] == ret: 00
[   14.437606] == KWIFEXITED(ret): 01
[   14.437606] KWEXITSTATUS(ret): 0
[   14.438482] virbr0: port 1(virbr0-nic) entered blocking state
[   14.438485] virbr0: port 1(virbr0-nic) entered disabled state
[   14.438635] device virbr0-nic entered promiscuous mode
[   14.439654] umh: sub_info->path: /sbin/bridge-stp
[   14.439656] /sbin/bridge-stp 
[   14.439656] virbr0 
[   14.439656] start 

[   14.439734] == ret: 00
[   14.439735] == KWIFEXITED(ret): 01
[   14.439736] KWEXITSTATUS(ret): 0
[   14.567072] umh: sub_info->path: /sbin/modprobe
[   14.567074] /sbin/modprobe 
[   14.567075] -q 
[   14.567075] -- 
[   14.567076] ipt_CT 

[   14.576352] == ret: 00
[   14.576353] == KWIFEXITED(ret): 01
[   14.576354] KWEXITSTATUS(ret): 0
[   14.576443] umh: sub_info->path: /sbin/modprobe
[   14.576444] /sbin/modprobe 
[   14.576444] -q 
[   14.576445] -- 
[   14.576445] nfct-helper-tftp 

[   14.605182] == ret: 00
[   14.605183] == KWIFEXITED(ret): 01
[   14.605184] KWEXITSTATUS(ret): 0
[   14.675162] umh: sub_info->path: /sbin/modprobe
[   14.675164] /sbin/modprobe 
[   14.675164] -q 
[   14.675165] -- 
[   14.675165] ipt_MASQUERADE 

[   14.684223] == ret: 00
[   14.684224] == KWIFEXITED(ret): 01
[   14.684225] KWEXITSTATUS(ret): 0
[   14.701460] umh: sub_info->path: /sbin/modprobe
[   14.701462] /sbin/modprobe 
[   14.701462] -q 
[   14.701462] -- 
[   14.701463] ipt_CHECKSUM 

[   14.707019] == ret: 00
[   14.707021] == KWIFEXITED(ret): 01
[   14.707022] KWEXITSTATUS(ret): 0
[   14.707948] virbr0: port 1(virbr0-nic) entered blocking state
[   14.731752] virbr0: port 1(virbr0-nic) entered disabled state
[   14.981871] umh: sub_info->path: /sbin/modprobe
[   14.981873] /sbin/modprobe 
[   14.981873] -q 
[   14.981874] -- 
[   14.981874] char-major-10-232 

[   15.014374] == ret: 00
[   15.014375] == KWIFEXITED(ret): 01
[   15.014376] KWEXITSTATUS(ret): 0
[   15.102392] IPv6: ADDRCONF(NETDEV_CHANGE): enP2s564np0: link becomes ready
[   20.341424] virbr0: port 2(vnet0) entered blocking state
[   20.341429] virbr0: port 2(vnet0) entered disabled state
[   20.341514] device vnet0 entered promiscuous mode
[   20.341714] virbr0: port 2(vnet0) entered blocking state
[   20.341988] umh: sub_info->path: /sbin/modprobe
[   20.341989] /sbin/modprobe 
[   20.341989] -q 
[   20.341990] -- 
[   20.341991] char-major-10-238 

[   20.408823] == ret: 00
[   20.408825] == KWIFEXITED(ret): 01
[   20.408826] KWEXITSTATUS(ret): 0

Tetsuo Handa July 1, 2020, 1:24 p.m. UTC | #22

On 2020/07/01 19:08, Christian Borntraeger wrote:
> 
> 
> On 30.06.20 19:57, Luis Chamberlain wrote:
>> On Fri, Jun 26, 2020 at 02:54:10AM +0000, Luis Chamberlain wrote:
>>> On Wed, Jun 24, 2020 at 08:37:55PM +0200, Christian Borntraeger wrote:
>>>>
>>>>
>>>> On 24.06.20 20:32, Christian Borntraeger wrote:
>>>> [...]> 
>>>>> So the translations look correct. But your change is actually a sematic change
>>>>> if(ret) will only trigger if there is an error
>>>>> if (KWIFEXITED(ret)) will always trigger when the process ends. So we will always overwrite -ECHILD
>>>>> and we did not do it before. 
>>>>>
>>>>
>>>> So the right fix is
>>>>
>>>> diff --git a/kernel/umh.c b/kernel/umh.c
>>>> index f81e8698e36e..a3a3196e84d1 100644
>>>> --- a/kernel/umh.c
>>>> +++ b/kernel/umh.c
>>>> @@ -154,7 +154,7 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
>>>>                  * the real error code is already in sub_info->retval or
>>>>                  * sub_info->retval is 0 anyway, so don't mess with it then.
>>>>                  */
>>>> -               if (KWIFEXITED(ret))
>>>> +               if (KWEXITSTATUS(ret))
>>>>                         sub_info->retval = KWEXITSTATUS(ret);

Well, it is not br_stp_call_user() but br_stp_start() which is expecting
to set sub_info->retval for both KWIFEXITED() case and KWIFSIGNALED() case.
That is, sub_info->retval needs to carry raw value (i.e. without "umh: fix
processed error when UMH_WAIT_PROC is used" will be the correct behavior).

Luis Chamberlain July 1, 2020, 1:46 p.m. UTC | #23

On Wed, Jul 01, 2020 at 12:08:11PM +0200, Christian Borntraeger wrote:
> dmesg attached
> [   14.438482] virbr0: port 1(virbr0-nic) entered blocking state
> [   14.438485] virbr0: port 1(virbr0-nic) entered disabled state
> [   14.438635] device virbr0-nic entered promiscuous mode
> [   14.439654] umh: sub_info->path: /sbin/bridge-stp
> [   14.439656] /sbin/bridge-stp 
> [   14.439656] virbr0 
> [   14.439656] start 

OK so what we seem to want to debug is the umh call for:

/sbin/bridge-stp virbr0 start

> [   14.439734] == ret: 00
> [   14.439735] == KWIFEXITED(ret): 01
> [   14.439736] KWEXITSTATUS(ret): 0

Its not clear if this is the respective return value, but now
that we have a clue that this is the the only non-modprobe
call, we should have a clearer certainty that this is the
issue.

Indeed my patch "umh: fix processed error when UMH_WAIT_PROC is used"
did modify bridge-stp in the following way:

diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
index ba55851fe132..bdd94b45396b 100644
--- a/net/bridge/br_stp_if.c
+++ b/net/bridge/br_stp_if.c
@@ -133,14 +133,8 @@ static int br_stp_call_user(struct net_bridge *br, char *arg)
 
 	/* call userspace STP and report program errors */
 	rc = call_usermodehelper(BR_STP_PROG, argv, envp, UMH_WAIT_PROC);
-	if (rc > 0) {
-		if (rc & 0xff)
-			br_debug(br, BR_STP_PROG " received signal %d\n",
-				 rc & 0x7f);
-		else
-			br_debug(br, BR_STP_PROG " exited with code %d\n",
-				 (rc >> 8) & 0xff);
-	}
+	if (rc != 0)
+		br_debug(br, BR_STP_PROG " failed with exit code %d\n", rc);
 
 	return rc;
 }

If you look at this carefully though you'll notice that the change just
modifies *when* we issue the debug print. The more important relevant
part of the patch however was that we now do return a correct error
value when the call fails.

More importantly, depending on if an error or not we run the kernel STP
or userspace STP later:

static void br_stp_start(struct net_bridge *br)
{
	int err = -ENOENT;

	if (net_eq(dev_net(br->dev), &init_net))
		err = br_stp_call_user(br, "start");

	if (err && err != -ENOENT)
		br_err(br, "failed to start userspace STP (%d)\n", err);

	spin_lock_bh(&br->lock);

	if (br->bridge_forward_delay < BR_MIN_FORWARD_DELAY)
		__br_set_forward_delay(br, BR_MIN_FORWARD_DELAY);
	else if (br->bridge_forward_delay > BR_MAX_FORWARD_DELAY)
		__br_set_forward_delay(br, BR_MAX_FORWARD_DELAY);

--------------------->  can you enable debug print for this to see what
--------------------->  you end up using?
	if (!err) {
		br->stp_enabled = BR_USER_STP;
		br_debug(br, "userspace STP started\n");
	} else {
		br->stp_enabled = BR_KERNEL_STP;
		br_debug(br, "using kernel STP\n");

		/* To start timers on any ports left in blocking */
		if (br->dev->flags & IFF_UP)
			mod_timer(&br->hello_timer, jiffies + br->hello_time);
		br_port_state_selection(br);
	}
----------------->

	spin_unlock_bh(&br->lock);
}

Luis Chamberlain July 1, 2020, 1:53 p.m. UTC | #24

On Wed, Jul 01, 2020 at 10:24:29PM +0900, Tetsuo Handa wrote:
> On 2020/07/01 19:08, Christian Borntraeger wrote:
> > 
> > 
> > On 30.06.20 19:57, Luis Chamberlain wrote:
> >> On Fri, Jun 26, 2020 at 02:54:10AM +0000, Luis Chamberlain wrote:
> >>> On Wed, Jun 24, 2020 at 08:37:55PM +0200, Christian Borntraeger wrote:
> >>>>
> >>>>
> >>>> On 24.06.20 20:32, Christian Borntraeger wrote:
> >>>> [...]> 
> >>>>> So the translations look correct. But your change is actually a sematic change
> >>>>> if(ret) will only trigger if there is an error
> >>>>> if (KWIFEXITED(ret)) will always trigger when the process ends. So we will always overwrite -ECHILD
> >>>>> and we did not do it before. 
> >>>>>
> >>>>
> >>>> So the right fix is
> >>>>
> >>>> diff --git a/kernel/umh.c b/kernel/umh.c
> >>>> index f81e8698e36e..a3a3196e84d1 100644
> >>>> --- a/kernel/umh.c
> >>>> +++ b/kernel/umh.c
> >>>> @@ -154,7 +154,7 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
> >>>>                  * the real error code is already in sub_info->retval or
> >>>>                  * sub_info->retval is 0 anyway, so don't mess with it then.
> >>>>                  */
> >>>> -               if (KWIFEXITED(ret))
> >>>> +               if (KWEXITSTATUS(ret))
> >>>>                         sub_info->retval = KWEXITSTATUS(ret);
> 
> Well, it is not br_stp_call_user() but br_stp_start() which is expecting
> to set sub_info->retval for both KWIFEXITED() case and KWIFSIGNALED() case.
> That is, sub_info->retval needs to carry raw value (i.e. without "umh: fix
> processed error when UMH_WAIT_PROC is used" will be the correct behavior).

br_stp_start() doesn't check for the raw value, it just checks for err
or !err. So the patch, "umh: fix processed error when UMH_WAIT_PROC is
used" propagates the correct error now.

Christian, can you try removing the binary temporarily and seeing if
you get your bridge working?

  Luis

Tetsuo Handa July 1, 2020, 2:08 p.m. UTC | #25

On 2020/07/01 22:53, Luis Chamberlain wrote:
>> Well, it is not br_stp_call_user() but br_stp_start() which is expecting
>> to set sub_info->retval for both KWIFEXITED() case and KWIFSIGNALED() case.
>> That is, sub_info->retval needs to carry raw value (i.e. without "umh: fix
>> processed error when UMH_WAIT_PROC is used" will be the correct behavior).
> 
> br_stp_start() doesn't check for the raw value, it just checks for err
> or !err. So the patch, "umh: fix processed error when UMH_WAIT_PROC is
> used" propagates the correct error now.

No. If "/sbin/bridge-stp virbr0 start" terminated due to e.g. SIGSEGV
(for example, by inserting "kill -SEGV $$" into right after "#!/bin/sh" line),
br_stp_start() needs to select BR_KERNEL_STP path. We can't assume that
/sbin/bridge-stp is always terminated by exit() syscall (and hence we can't
ignore KWIFSIGNALED() case in call_usermodehelper_exec_sync()).

Christian Borntraeger July 1, 2020, 3:26 p.m. UTC | #26

On 01.07.20 15:53, Luis Chamberlain wrote:
> On Wed, Jul 01, 2020 at 10:24:29PM +0900, Tetsuo Handa wrote:
>> On 2020/07/01 19:08, Christian Borntraeger wrote:
>>>
>>>
>>> On 30.06.20 19:57, Luis Chamberlain wrote:
>>>> On Fri, Jun 26, 2020 at 02:54:10AM +0000, Luis Chamberlain wrote:
>>>>> On Wed, Jun 24, 2020 at 08:37:55PM +0200, Christian Borntraeger wrote:
>>>>>>
>>>>>>
>>>>>> On 24.06.20 20:32, Christian Borntraeger wrote:
>>>>>> [...]> 
>>>>>>> So the translations look correct. But your change is actually a sematic change
>>>>>>> if(ret) will only trigger if there is an error
>>>>>>> if (KWIFEXITED(ret)) will always trigger when the process ends. So we will always overwrite -ECHILD
>>>>>>> and we did not do it before. 
>>>>>>>
>>>>>>
>>>>>> So the right fix is
>>>>>>
>>>>>> diff --git a/kernel/umh.c b/kernel/umh.c
>>>>>> index f81e8698e36e..a3a3196e84d1 100644
>>>>>> --- a/kernel/umh.c
>>>>>> +++ b/kernel/umh.c
>>>>>> @@ -154,7 +154,7 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
>>>>>>                  * the real error code is already in sub_info->retval or
>>>>>>                  * sub_info->retval is 0 anyway, so don't mess with it then.
>>>>>>                  */
>>>>>> -               if (KWIFEXITED(ret))
>>>>>> +               if (KWEXITSTATUS(ret))
>>>>>>                         sub_info->retval = KWEXITSTATUS(ret);
>>
>> Well, it is not br_stp_call_user() but br_stp_start() which is expecting
>> to set sub_info->retval for both KWIFEXITED() case and KWIFSIGNALED() case.
>> That is, sub_info->retval needs to carry raw value (i.e. without "umh: fix
>> processed error when UMH_WAIT_PROC is used" will be the correct behavior).
> 
> br_stp_start() doesn't check for the raw value, it just checks for err
> or !err. So the patch, "umh: fix processed error when UMH_WAIT_PROC is
> used" propagates the correct error now.
> 
> Christian, can you try removing the binary temporarily and seeing if
> you get your bridge working?

As I matter of fact I do NOT have /sbin/bridge-stp installed.

Luis Chamberlain July 1, 2020, 3:38 p.m. UTC | #27

On Wed, Jul 01, 2020 at 11:08:57PM +0900, Tetsuo Handa wrote:
> On 2020/07/01 22:53, Luis Chamberlain wrote:
> >> Well, it is not br_stp_call_user() but br_stp_start() which is expecting
> >> to set sub_info->retval for both KWIFEXITED() case and KWIFSIGNALED() case.
> >> That is, sub_info->retval needs to carry raw value (i.e. without "umh: fix
> >> processed error when UMH_WAIT_PROC is used" will be the correct behavior).
> > 
> > br_stp_start() doesn't check for the raw value, it just checks for err
> > or !err. So the patch, "umh: fix processed error when UMH_WAIT_PROC is
> > used" propagates the correct error now.
> 
> No. If "/sbin/bridge-stp virbr0 start" terminated due to e.g. SIGSEGV
> (for example, by inserting "kill -SEGV $$" into right after "#!/bin/sh" line),
> br_stp_start() needs to select BR_KERNEL_STP path. We can't assume that
> /sbin/bridge-stp is always terminated by exit() syscall (and hence we can't
> ignore KWIFSIGNALED() case in call_usermodehelper_exec_sync()).

Ah, well that would be a different fix required, becuase again,
br_stp_start() does not untangle the correct error today really.
I also I think it would be odd odd that SIGSEGV or another signal 
is what was terminating Christian's bridge stp call, but let's
find out!

Note we pass 0 to the options to wait so the mistake here could indeed
be that we did not need KWIFSIGNALED(). I was afraid of this prospect...
as it other implications.

It means we either *open code* all callers, or we handle this in a
unified way on the umh. And if we do handle this in a unified way, it
then begs the question as to *what* do we pass for the signals case and
continued case. Below we just pass the signal, and treat continued as
OK, but treating continued as OK would also be a *new* change as well.

For instance (this goes just boot tested, but Christian if you can
try this as well that would be appreciated):

diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index bba06befbff5..d1898f5dd1fc 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -105,10 +105,12 @@ extern long kernel_wait4(pid_t, int __user *, int, struct rusage *);
 
 /* Only add helpers for actual use cases in the kernel */
 #define KWEXITSTATUS(status)		(__KWEXITSTATUS(status))
+#define KWTERMSIG(status)		(__KWTERMSIG(status))
+#define KWSTOPSIG(status)		(__KWSTOPSIG(status))
 #define KWIFEXITED(status)		(__KWIFEXITED(status))
-
-/* Nonzero if STATUS indicates normal termination.  */
-#define __KWIFEXITED(status)     (__KWTERMSIG(status) == 0)
+#define KWIFSIGNALED(status)		(__KWIFSIGNALED(status))
+#define KWIFSTOPPED(status)		(__KWIFSTOPPED(status))
+#define KWIFCONTINUED(status)		(__KWIFCONTINUED(status))
 
 /* If KWIFEXITED(STATUS), the low-order 8 bits of the status.  */
 #define __KWEXITSTATUS(status)   (((status) & 0xff00) >> 8)
@@ -116,6 +118,24 @@ extern long kernel_wait4(pid_t, int __user *, int, struct rusage *);
 /* If KWIFSIGNALED(STATUS), the terminating signal.  */
 #define __KWTERMSIG(status)      ((status) & 0x7f)
 
+/* If KWIFSTOPPED(STATUS), the signal that stopped the child.  */
+#define __KWSTOPSIG(status)      __KWEXITSTATUS(status)
+
+/* Nonzero if STATUS indicates normal termination.  */
+#define __KWIFEXITED(status)     (__KWTERMSIG(status) == 0)
+
+/* Nonzero if STATUS indicates termination by a signal.  */
+#define __KWIFSIGNALED(status) \
+	(((signed char) (((status) & 0x7f) + 1) >> 1) > 0)
+
+/* Nonzero if STATUS indicates the child is stopped.  */
+#define __KWIFSTOPPED(status)    (((status) & 0xff) == 0x7f)
+
+/* Nonzero if STATUS indicates the child continued after a stop. */
+#define __KWIFCONTINUED(status) ((status) == __KW_CONTINUED)
+
+#define __KW_CONTINUED		0xffff
+
 extern void free_task(struct task_struct *tsk);
 
 /* sched_exec is called by processes performing an exec */
diff --git a/kernel/umh.c b/kernel/umh.c
index f81e8698e36e..c98fb1ed90c9 100644
--- a/kernel/umh.c
+++ b/kernel/umh.c
@@ -156,6 +156,18 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
 		 */
 		if (KWIFEXITED(ret))
 			sub_info->retval = KWEXITSTATUS(ret);
+		/*
+		 * Do we really want to be passing the signal, or do we pass
+		 * a single error code for all cases?
+		 */
+		else if (KWIFSIGNALED(ret))
+			sub_info->retval = KWTERMSIG(ret);
+		/* Same here */
+		else if (KWIFSTOPPED((ret)))
+			sub_info->retval = KWSTOPSIG(ret);
+		/* And are we really sure we want this? */
+		else if (KWIFCONTINUED((ret)))
+			sub_info->retval = 0;
 	}
 
 	/* Restore default kernel sig handler */

Christian Borntraeger July 1, 2020, 3:48 p.m. UTC | #28

On 01.07.20 17:38, Luis Chamberlain wrote:
> On Wed, Jul 01, 2020 at 11:08:57PM +0900, Tetsuo Handa wrote:
>> On 2020/07/01 22:53, Luis Chamberlain wrote:
>>>> Well, it is not br_stp_call_user() but br_stp_start() which is expecting
>>>> to set sub_info->retval for both KWIFEXITED() case and KWIFSIGNALED() case.
>>>> That is, sub_info->retval needs to carry raw value (i.e. without "umh: fix
>>>> processed error when UMH_WAIT_PROC is used" will be the correct behavior).
>>>
>>> br_stp_start() doesn't check for the raw value, it just checks for err
>>> or !err. So the patch, "umh: fix processed error when UMH_WAIT_PROC is
>>> used" propagates the correct error now.
>>
>> No. If "/sbin/bridge-stp virbr0 start" terminated due to e.g. SIGSEGV
>> (for example, by inserting "kill -SEGV $$" into right after "#!/bin/sh" line),
>> br_stp_start() needs to select BR_KERNEL_STP path. We can't assume that
>> /sbin/bridge-stp is always terminated by exit() syscall (and hence we can't
>> ignore KWIFSIGNALED() case in call_usermodehelper_exec_sync()).
> 
> Ah, well that would be a different fix required, becuase again,
> br_stp_start() does not untangle the correct error today really.
> I also I think it would be odd odd that SIGSEGV or another signal 
> is what was terminating Christian's bridge stp call, but let's
> find out!
> 
> Note we pass 0 to the options to wait so the mistake here could indeed
> be that we did not need KWIFSIGNALED(). I was afraid of this prospect...
> as it other implications.
> 
> It means we either *open code* all callers, or we handle this in a
> unified way on the umh. And if we do handle this in a unified way, it
> then begs the question as to *what* do we pass for the signals case and
> continued case. Below we just pass the signal, and treat continued as
> OK, but treating continued as OK would also be a *new* change as well.
> 
> For instance (this goes just boot tested, but Christian if you can
> try this as well that would be appreciated):


Does not help, the bridge stays in DOWN state.

Luis Chamberlain July 1, 2020, 3:58 p.m. UTC | #29

On Wed, Jul 01, 2020 at 05:48:48PM +0200, Christian Borntraeger wrote:
> 
> 
> On 01.07.20 17:38, Luis Chamberlain wrote:
> > On Wed, Jul 01, 2020 at 11:08:57PM +0900, Tetsuo Handa wrote:
> >> On 2020/07/01 22:53, Luis Chamberlain wrote:
> >>>> Well, it is not br_stp_call_user() but br_stp_start() which is expecting
> >>>> to set sub_info->retval for both KWIFEXITED() case and KWIFSIGNALED() case.
> >>>> That is, sub_info->retval needs to carry raw value (i.e. without "umh: fix
> >>>> processed error when UMH_WAIT_PROC is used" will be the correct behavior).
> >>>
> >>> br_stp_start() doesn't check for the raw value, it just checks for err
> >>> or !err. So the patch, "umh: fix processed error when UMH_WAIT_PROC is
> >>> used" propagates the correct error now.
> >>
> >> No. If "/sbin/bridge-stp virbr0 start" terminated due to e.g. SIGSEGV
> >> (for example, by inserting "kill -SEGV $$" into right after "#!/bin/sh" line),
> >> br_stp_start() needs to select BR_KERNEL_STP path. We can't assume that
> >> /sbin/bridge-stp is always terminated by exit() syscall (and hence we can't
> >> ignore KWIFSIGNALED() case in call_usermodehelper_exec_sync()).
> > 
> > Ah, well that would be a different fix required, becuase again,
> > br_stp_start() does not untangle the correct error today really.
> > I also I think it would be odd odd that SIGSEGV or another signal 
> > is what was terminating Christian's bridge stp call, but let's
> > find out!
> > 
> > Note we pass 0 to the options to wait so the mistake here could indeed
> > be that we did not need KWIFSIGNALED(). I was afraid of this prospect...
> > as it other implications.
> > 
> > It means we either *open code* all callers, or we handle this in a
> > unified way on the umh. And if we do handle this in a unified way, it
> > then begs the question as to *what* do we pass for the signals case and
> > continued case. Below we just pass the signal, and treat continued as
> > OK, but treating continued as OK would also be a *new* change as well.
> > 
> > For instance (this goes just boot tested, but Christian if you can
> > try this as well that would be appreciated):
> 
> 
> Does not help, the bridge stays in DOWN state. 

OK thanks for testing, that was fast! Does your code go through the
STP kernel path or userpath? If it is taking the STP kernel path
then this is not the real culprit to your issue then.

  Luis

Christian Borntraeger July 1, 2020, 4:01 p.m. UTC | #30

On 01.07.20 17:58, Luis Chamberlain wrote:
[...]

>>>
>>> Ah, well that would be a different fix required, becuase again,
>>> br_stp_start() does not untangle the correct error today really.
>>> I also I think it would be odd odd that SIGSEGV or another signal 
>>> is what was terminating Christian's bridge stp call, but let's
>>> find out!
>>>
>>> Note we pass 0 to the options to wait so the mistake here could indeed
>>> be that we did not need KWIFSIGNALED(). I was afraid of this prospect...
>>> as it other implications.
>>>
>>> It means we either *open code* all callers, or we handle this in a
>>> unified way on the umh. And if we do handle this in a unified way, it
>>> then begs the question as to *what* do we pass for the signals case and
>>> continued case. Below we just pass the signal, and treat continued as
>>> OK, but treating continued as OK would also be a *new* change as well.
>>>
>>> For instance (this goes just boot tested, but Christian if you can
>>> try this as well that would be appreciated):
>>
>>
>> Does not help, the bridge stays in DOWN state. 
> 
> OK thanks for testing, that was fast! Does your code go through the
> STP kernel path or userpath? If it is taking the STP kernel path
> then this is not the real culprit to your issue then.

I have no idea and I cannot look into this right now. I can test
patches as compile,reboot and test is almost no effort.

FWIW, this is just the network of a KVM guest of libvirts default network
no longer working, maybe you can reproduce this on x86 as well?

Tetsuo Handa July 2, 2020, 4:26 a.m. UTC | #31

On 2020/07/02 0:38, Luis Chamberlain wrote:
> @@ -156,6 +156,18 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
>  		 */
>  		if (KWIFEXITED(ret))
>  			sub_info->retval = KWEXITSTATUS(ret);
> +		/*
> +		 * Do we really want to be passing the signal, or do we pass
> +		 * a single error code for all cases?
> +		 */
> +		else if (KWIFSIGNALED(ret))
> +			sub_info->retval = KWTERMSIG(ret);

No, this is bad. Caller of usermode helper is unable to distinguish exit(9)
and e.g. SIGKILL'ed by the OOM-killer. Please pass raw exit status value.

I feel that caller of usermode helper should not use exit status value.
For example, call_sbin_request_key() is checking

  test_bit(KEY_FLAG_USER_CONSTRUCT, &key->flags) || key_validate(key) < 0

condition (if usermode helper was invoked) in order to "ignore any errors from
userspace if the key was instantiated".

> +		/* Same here */
> +		else if (KWIFSTOPPED((ret)))
> +			sub_info->retval = KWSTOPSIG(ret);
> +		/* And are we really sure we want this? */
> +		else if (KWIFCONTINUED((ret)))
> +			sub_info->retval = 0;
>  	}
>  
>  	/* Restore default kernel sig handler */
>

Luis Chamberlain July 2, 2020, 7:46 p.m. UTC | #32

On Thu, Jul 02, 2020 at 01:26:53PM +0900, Tetsuo Handa wrote:
> On 2020/07/02 0:38, Luis Chamberlain wrote:
> > @@ -156,6 +156,18 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
> >  		 */
> >  		if (KWIFEXITED(ret))
> >  			sub_info->retval = KWEXITSTATUS(ret);
> > +		/*
> > +		 * Do we really want to be passing the signal, or do we pass
> > +		 * a single error code for all cases?
> > +		 */
> > +		else if (KWIFSIGNALED(ret))
> > +			sub_info->retval = KWTERMSIG(ret);
> 
> No, this is bad. Caller of usermode helper is unable to distinguish exit(9)
> and e.g. SIGKILL'ed by the OOM-killer.

Right, the question is: do we care?

> Please pass raw exit status value.
> 
> I feel that caller of usermode helper should not use exit status value.
> For example, call_sbin_request_key() is checking
> 
>   test_bit(KEY_FLAG_USER_CONSTRUCT, &key->flags) || key_validate(key) < 0
> 
> condition (if usermode helper was invoked) in order to "ignore any errors from
> userspace if the key was instantiated".

For those not familiar with this code path, or if you cannot decipher
the above, the code path in question was:

static int call_sbin_request_key(struct key *authkey, void *aux)                
{
	...
	/* do it */                                                             
	ret = call_usermodehelper_keys(request_key, argv, envp, keyring,        
				       UMH_WAIT_PROC);
	kdebug("usermode -> 0x%x", ret);
	if (ret >= 0) {
		/* ret is the exit/wait code */
		if (test_bit(KEY_FLAG_USER_CONSTRUCT, &key->flags) ||
		    key_validate(key) < 0)
		    ret = -ENOKEY;
		/* ignore any errors from userspace if the key was      
		 * instantiated */  
		 ret = 0;
	}
	...
}

And the umh patch "umh: fix processed error when UMH_WAIT_PROC is used"
changed this to:

-       if (ret >= 0) {
+       if (ret != 0) {

Prior to the patch negative return values from userspace were still
being captured, and likewise signals, but the error value was not
raw, not the actual value. After the patch, since we check for ret != 0
we still upkeep the sanity check for any error, correct the error value,
but as you noted signals were ignored as I made the wrong assumption
we would ignore them. The umh sub_info->retval is set after my original
patch only if KWIFSIGNALED(ret)), and ignored signals, and so that
would be now capitured with the additional KWIFSIGNALED(ret)) check.

The question still stands:

Do we want to open code all these checks or simply wrap them up in
the umh. If we do the later, as you note exit(9) and a SIGKILL will
be the same to the inspector in the kernel. But do we care?

Do we really want umh callers differntiatin between signals and exit values?

The alternative to making a compromise is using generic wrappers for
things which make sense and letting the callers use those.

  Luis

> > +		/* Same here */
> > +		else if (KWIFSTOPPED((ret)))
> > +			sub_info->retval = KWSTOPSIG(ret);
> > +		/* And are we really sure we want this? */
> > +		else if (KWIFCONTINUED((ret)))
> > +			sub_info->retval = 0;
> >  	}
> >  
> >  	/* Restore default kernel sig handler */
> > 
>

Tetsuo Handa July 3, 2020, 12:52 a.m. UTC | #33

On 2020/07/03 4:46, Luis Chamberlain wrote:
> On Thu, Jul 02, 2020 at 01:26:53PM +0900, Tetsuo Handa wrote:
>> On 2020/07/02 0:38, Luis Chamberlain wrote:
>>> @@ -156,6 +156,18 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
>>>  		 */
>>>  		if (KWIFEXITED(ret))
>>>  			sub_info->retval = KWEXITSTATUS(ret);
>>> +		/*
>>> +		 * Do we really want to be passing the signal, or do we pass
>>> +		 * a single error code for all cases?
>>> +		 */
>>> +		else if (KWIFSIGNALED(ret))
>>> +			sub_info->retval = KWTERMSIG(ret);
>>
>> No, this is bad. Caller of usermode helper is unable to distinguish exit(9)
>> and e.g. SIGKILL'ed by the OOM-killer.
> 
> Right, the question is: do we care?

Yes, we have to care.

> And the umh patch "umh: fix processed error when UMH_WAIT_PROC is used"
> changed this to:
> 
> -       if (ret >= 0) {
> +       if (ret != 0) {
> 
> Prior to the patch negative return values from userspace were still
> being captured, and likewise signals, but the error value was not
> raw, not the actual value. After the patch, since we check for ret != 0
> we still upkeep the sanity check for any error, correct the error value,
> but as you noted signals were ignored as I made the wrong assumption
> we would ignore them. The umh sub_info->retval is set after my original
> patch only if KWIFSIGNALED(ret)), and ignored signals, and so that
> would be now capitured with the additional KWIFSIGNALED(ret)) check.

"call_usermodehelper_keys() == 0" (i.e. usermode helper was successfully
started and successfully terminated via exit(0)) is different from "there is
nothing to do". call_sbin_request_key() == 0 case still has to check for
possibility of -ENOKEY case.

> 
> The question still stands:
> 
> Do we want to open code all these checks or simply wrap them up in
> the umh. If we do the later, as you note exit(9) and a SIGKILL will
> be the same to the inspector in the kernel. But do we care?

Yes, we do care.

> 
> Do we really want umh callers differntiatin between signals and exit values?

Yes, we do.

> 
> The alternative to making a compromise is using generic wrappers for
> things which make sense and letting the callers use those.

I suggest just introducing KWIFEXITED()/KWEXITSTATUS()/KWIFSIGNALED()/KWTERMSIG()
macros and fixing the callers, for some callers are not aware of possibility of
KWIFSIGNALED() case.

For example, conn_try_outdate_peer() in drivers/block/drbd/drbd_nl.c misbehaves if
drbd_usermode_helper process was terminated by a signal, for the switch() statement
after returning from conn_helper() is assuming that the return value of conn_helper()
is a KWEXITSTATUS() value if drbd_usermode_helper process was successfully started.
If drbd_usermode_helper process was terminated by SIGQUIT (which is 3),
conn_try_outdate_peer() will by error hit "case P_INCONSISTENT:" (which is 3);
conn_try_outdate_peer() should hit "default: /* The script is broken ... */"
unless KWIFEXITED() == true.

Your patch is trying to obnubilate the return code.

Luis Chamberlain July 3, 2020, 1:28 p.m. UTC | #34

On Fri, Jul 03, 2020 at 09:52:01AM +0900, Tetsuo Handa wrote:
> On 2020/07/03 4:46, Luis Chamberlain wrote:
> > The alternative to making a compromise is using generic wrappers for
> > things which make sense and letting the callers use those.
> 
> I suggest just introducing KWIFEXITED()/KWEXITSTATUS()/KWIFSIGNALED()/KWTERMSIG()
> macros and fixing the callers, for some callers are not aware of possibility of
> KWIFSIGNALED() case.

OK so we open code all uses. Do that in a next iteration.

  Luis

[4/5] umh: fix processed error when UMH_WAIT_PROC is used

Commit Message

Comments

Patch