Message ID | 1612260787-28015-1-git-send-email-wanghongzhe@huawei.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | [v1,1/1] Firstly, as Andy mentioned, this should be smp_rmb() instead of rmb(). considering that TSYNC is a cross-thread situation, and rmb() is a mandatory barrier which should not be used to control SMP effects, since mandatory barriers impose unnecessa | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
On Tue, Feb 02, 2021 at 06:13:07PM +0800, wanghongzhe wrote:
> Secondly, the smp_rmb() should be put between reading SYSCALL_WORK_SECCOMP and reading
<snip>
Your subject line of the patch is a bit odd :)
Hi wanghongzhe, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on v5.11-rc6] [also build test WARNING on next-20210125] [cannot apply to kees/for-next/seccomp] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/wanghongzhe/Firstly-as-Andy-mentioned-this-should-be-smp_rmb-instead-of-rmb-considering-that-TSYNC-is-a-cross-thread-situation-and-r/20210202-173311 base: 1048ba83fb1c00cd24172e23e8263972f6b5d9ac config: i386-randconfig-s001-20210202 (attached as .config) compiler: gcc-9 (Debian 9.3.0-15) 9.3.0 reproduce: # apt-get install sparse # sparse version: v0.6.3-215-g0fb77bb6-dirty # https://github.com/0day-ci/linux/commit/f79414957fc8acb6b680bbcd26fa987328a5724a git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review wanghongzhe/Firstly-as-Andy-mentioned-this-should-be-smp_rmb-instead-of-rmb-considering-that-TSYNC-is-a-cross-thread-situation-and-r/20210202-173311 git checkout f79414957fc8acb6b680bbcd26fa987328a5724a # save the attached .config to linux build tree make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=i386 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All warnings (new ones prefixed by >>): kernel/seccomp.c: In function '__secure_computing': >> kernel/seccomp.c:1313:6: warning: this statement may fall through [-Wimplicit-fallthrough=] 1313 | if (unlikely(current->seccomp.mode != SECCOMP_MODE_FILTER)) | ^ kernel/seccomp.c:1315:2: note: here 1315 | case SECCOMP_MODE_FILTER: | ^~~~ vim +1313 kernel/seccomp.c 1283 1284 int __secure_computing(const struct seccomp_data *sd) 1285 { 1286 int this_syscall; 1287 1288 if (IS_ENABLED(CONFIG_CHECKPOINT_RESTORE) && 1289 unlikely(current->ptrace & PT_SUSPEND_SECCOMP)) 1290 return 0; 1291 1292 this_syscall = sd ? sd->nr : 1293 syscall_get_nr(current, current_pt_regs()); 1294 1295 /* 1296 * Make sure that any changes to mode from another thread have 1297 * been seen after SYSCALL_WORK_SECCOMP was seen. 1298 */ 1299 smp_rmb(); 1300 1301 switch (current->seccomp.mode) { 1302 case SECCOMP_MODE_STRICT: 1303 __secure_computing_strict(this_syscall); /* may call do_exit */ 1304 return 0; 1305 /* 1306 * Make sure that change to mode (from SECCOMP_MODE_DISABLED to 1307 * SECCOMP_MODE_FILTER) from another thread using TSYNC ability 1308 * have been seen after SYSCALL_WORK_SECCOMP was seen. Read mode again behind 1309 * smp_rmb(), if it equals SECCOMP_MODE_FILTER, go to the right path. 1310 */ 1311 case SECCOMP_MODE_DISABLED: 1312 smp_rmb(); > 1313 if (unlikely(current->seccomp.mode != SECCOMP_MODE_FILTER)) 1314 BUG(); 1315 case SECCOMP_MODE_FILTER: 1316 return __seccomp_filter(this_syscall, sd, false); 1317 default: 1318 BUG(); 1319 } 1320 } 1321 #endif /* CONFIG_HAVE_ARCH_SECCOMP_FILTER */ 1322 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Tue, Feb 02, 2021 at 06:13:07PM +0800, wanghongzhe wrote: > Secondly, the smp_rmb() should be put between reading SYSCALL_WORK_SECCOMP and reading > seccomp.mode, not between reading seccomp.mode and seccomp->filter, to make > sure that any changes to mode from another thread have been seen after > SYSCALL_WORK_SECCOMP was seen, as the original comment shown. This issue seems to be > misintroduced at 13aa72f0fd0a9f98a41cefb662487269e2f1ad65 which aims to > refactor the filter callback and the API. So the intuitive solution is to put > it back like: > > Thirdly, however, we can go further to improve the performace of checking > syscall, considering that smp_rmb is always executed on the syscall-check > path at each time for both FILTER and STRICT check while the TSYNC case > which may lead to race condition is just a rare situation, and that in > some arch like Arm64 smp_rmb is dsb(ishld) not a cheap barrier() in x86-64. > > As a result, smp_rmb() should only be executed when necessary, e.g, it is > only necessary when current thread's mode is SECCOMP_MODE_DISABLED at the > first TYSNCed time, because after that the current thread's mode will always > be SECCOMP_MODE_FILTER (and SYSCALL_WORK_SECCOMP will always be set) and can not be > changed anymore by anyone. In other words, after that, any thread can not > change the mode (and SYSCALL_WORK_SECCOMP), so the race condition disappeared, and > no more smb_rmb() needed ever. > > So the solution is to read mode again behind smp_rmb() after the mode is seen > as SECCOMP_MODE_DISABLED by current thread at the first TSYNCed time, and if > the new mode don't equals to SECCOMP_MODE_FILTER, do BUG(), go to FILTER path > otherwise. > > RFC -> v1: > - replace rmb() with smp_rmb() > - move the smp_rmb() logic to the middle between SYSCALL_WORK_SECCOMP and mode > > Signed-off-by: wanghongzhe <wanghongzhe@huawei.com> > Reviewed-by: Andy Lutomirski <luto@amacapital.net> > --- > kernel/seccomp.c | 25 +++++++++++++++++-------- > 1 file changed, 17 insertions(+), 8 deletions(-) > > diff --git a/kernel/seccomp.c b/kernel/seccomp.c > index 952dc1c90229..a621fb913ec6 100644 > --- a/kernel/seccomp.c > +++ b/kernel/seccomp.c > @@ -1160,12 +1160,6 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd, > int data; > struct seccomp_data sd_local; > > - /* > - * Make sure that any changes to mode from another thread have > - * been seen after SYSCALL_WORK_SECCOMP was seen. > - */ > - rmb(); > - > if (!sd) { > populate_seccomp_data(&sd_local); > sd = &sd_local; > @@ -1289,7 +1283,6 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd, > > int __secure_computing(const struct seccomp_data *sd) > { > - int mode = current->seccomp.mode; > int this_syscall; > > if (IS_ENABLED(CONFIG_CHECKPOINT_RESTORE) && > @@ -1299,10 +1292,26 @@ int __secure_computing(const struct seccomp_data *sd) > this_syscall = sd ? sd->nr : > syscall_get_nr(current, current_pt_regs()); > > - switch (mode) { > + /* > + * Make sure that any changes to mode from another thread have > + * been seen after SYSCALL_WORK_SECCOMP was seen. > + */ > + smp_rmb(); Let's start with a patch that just replaces rmb() with smp_rmb() and then work on optimizing. Can you provide performance numbers that show rmb() (and soon smp_rmb()) is causing actual problems here? > + > + switch (current->seccomp.mode) { > case SECCOMP_MODE_STRICT: > __secure_computing_strict(this_syscall); /* may call do_exit */ > return 0; > + /* > + * Make sure that change to mode (from SECCOMP_MODE_DISABLED to > + * SECCOMP_MODE_FILTER) from another thread using TSYNC ability > + * have been seen after SYSCALL_WORK_SECCOMP was seen. Read mode again behind > + * smp_rmb(), if it equals SECCOMP_MODE_FILTER, go to the right path. > + */ > + case SECCOMP_MODE_DISABLED: > + smp_rmb(); > + if (unlikely(current->seccomp.mode != SECCOMP_MODE_FILTER)) > + BUG(); BUG() should never be used[1]. This is a recoverable situation, I think, and should be handled as such. -Kees [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on > case SECCOMP_MODE_FILTER: > return __seccomp_filter(this_syscall, sd, false); > default: > -- > 2.19.1 >
> Let's start with a patch that just replaces rmb() with smp_rmb() and then work > on optimizing. Can you provide performance numbers that show > rmb() (and soon smp_rmb()) is causing actual problems here? Ok, I will send a patch that just replaces rmb() with smp_rmb() and give performance numbers. > BUG() should never be used[1]. This is a recoverable situation, I think, and > should be handled as such. I just follow the default case behind. Let's discuss this issue in next patches.
diff --git a/kernel/seccomp.c b/kernel/seccomp.c index 952dc1c90229..a621fb913ec6 100644 --- a/kernel/seccomp.c +++ b/kernel/seccomp.c @@ -1160,12 +1160,6 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd, int data; struct seccomp_data sd_local; - /* - * Make sure that any changes to mode from another thread have - * been seen after SYSCALL_WORK_SECCOMP was seen. - */ - rmb(); - if (!sd) { populate_seccomp_data(&sd_local); sd = &sd_local; @@ -1289,7 +1283,6 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd, int __secure_computing(const struct seccomp_data *sd) { - int mode = current->seccomp.mode; int this_syscall; if (IS_ENABLED(CONFIG_CHECKPOINT_RESTORE) && @@ -1299,10 +1292,26 @@ int __secure_computing(const struct seccomp_data *sd) this_syscall = sd ? sd->nr : syscall_get_nr(current, current_pt_regs()); - switch (mode) { + /* + * Make sure that any changes to mode from another thread have + * been seen after SYSCALL_WORK_SECCOMP was seen. + */ + smp_rmb(); + + switch (current->seccomp.mode) { case SECCOMP_MODE_STRICT: __secure_computing_strict(this_syscall); /* may call do_exit */ return 0; + /* + * Make sure that change to mode (from SECCOMP_MODE_DISABLED to + * SECCOMP_MODE_FILTER) from another thread using TSYNC ability + * have been seen after SYSCALL_WORK_SECCOMP was seen. Read mode again behind + * smp_rmb(), if it equals SECCOMP_MODE_FILTER, go to the right path. + */ + case SECCOMP_MODE_DISABLED: + smp_rmb(); + if (unlikely(current->seccomp.mode != SECCOMP_MODE_FILTER)) + BUG(); case SECCOMP_MODE_FILTER: return __seccomp_filter(this_syscall, sd, false); default: