Message ID | 20240410182558.41467-15-haitao.huang@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Add Cgroup support for SGX EPC memory | expand |
On Wed Apr 10, 2024 at 9:25 PM EEST, Haitao Huang wrote: > To run selftests for EPC cgroup: > > sudo ./run_epc_cg_selftests.sh > > To watch misc cgroup 'current' changes during testing, run this in a > separate terminal: > > ./watch_misc_for_tests.sh current > > With different cgroups, the script starts one or multiple concurrent SGX > selftests (test_sgx), each to run the unclobbered_vdso_oversubscribed > test case, which loads an enclave of EPC size equal to the EPC capacity > available on the platform. The script checks results against the > expectation set for each cgroup and reports success or failure. > > The script creates 3 different cgroups at the beginning with following > expectations: > > 1) SMALL - intentionally small enough to fail the test loading an > enclave of size equal to the capacity. > 2) LARGE - large enough to run up to 4 concurrent tests but fail some if > more than 4 concurrent tests are run. The script starts 4 expecting at > least one test to pass, and then starts 5 expecting at least one test > to fail. > 3) LARGER - limit is the same as the capacity, large enough to run lots of > concurrent tests. The script starts 8 of them and expects all pass. > Then it reruns the same test with one process randomly killed and > usage checked to be zero after all processes exit. > > The script also includes a test with low mem_cg limit and LARGE sgx_epc > limit to verify that the RAM used for per-cgroup reclamation is charged > to a proper mem_cg. For this test, it turns off swapping before start, > and turns swapping back on afterwards. > > Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> > --- > V11: > - Remove cgroups-tools dependency and make scripts ash compatible. (Jarkko) > - Drop support for cgroup v1 and simplify. (Michal, Jarkko) > - Add documentation for functions. (Jarkko) > - Turn off swapping before memcontrol tests and back on after > - Format and style fixes, name for hard coded values > > V7: > - Added memcontrol test. > > V5: > - Added script with automatic results checking, remove the interactive > script. > - The script can run independent from the series below. > --- > tools/testing/selftests/sgx/ash_cgexec.sh | 16 + > .../selftests/sgx/run_epc_cg_selftests.sh | 275 ++++++++++++++++++ > .../selftests/sgx/watch_misc_for_tests.sh | 11 + > 3 files changed, 302 insertions(+) > create mode 100755 tools/testing/selftests/sgx/ash_cgexec.sh > create mode 100755 tools/testing/selftests/sgx/run_epc_cg_selftests.sh > create mode 100755 tools/testing/selftests/sgx/watch_misc_for_tests.sh > > diff --git a/tools/testing/selftests/sgx/ash_cgexec.sh b/tools/testing/selftests/sgx/ash_cgexec.sh > new file mode 100755 > index 000000000000..cfa5d2b0e795 > --- /dev/null > +++ b/tools/testing/selftests/sgx/ash_cgexec.sh > @@ -0,0 +1,16 @@ > +#!/usr/bin/env sh > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright(c) 2024 Intel Corporation. > + > +# Start a program in a given cgroup. > +# Supports V2 cgroup paths, relative to /sys/fs/cgroup > +if [ "$#" -lt 2 ]; then > + echo "Usage: $0 <v2 cgroup path> <command> [args...]" > + exit 1 > +fi > +# Move this shell to the cgroup. > +echo 0 >/sys/fs/cgroup/$1/cgroup.procs > +shift > +# Execute the command within the cgroup > +exec "$@" > + > diff --git a/tools/testing/selftests/sgx/run_epc_cg_selftests.sh b/tools/testing/selftests/sgx/run_epc_cg_selftests.sh > new file mode 100755 > index 000000000000..dd56273056fc > --- /dev/null > +++ b/tools/testing/selftests/sgx/run_epc_cg_selftests.sh > @@ -0,0 +1,275 @@ > +#!/usr/bin/env sh > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright(c) 2023, 2024 Intel Corporation. > + > +TEST_ROOT_CG=selftest > +TEST_CG_SUB1=$TEST_ROOT_CG/test1 > +TEST_CG_SUB2=$TEST_ROOT_CG/test2 > +# We will only set limit in test1 and run tests in test3 > +TEST_CG_SUB3=$TEST_ROOT_CG/test1/test3 > +TEST_CG_SUB4=$TEST_ROOT_CG/test4 > + > +# Cgroup v2 only > +CG_ROOT=/sys/fs/cgroup > +mkdir -p $CG_ROOT/$TEST_CG_SUB1 > +mkdir -p $CG_ROOT/$TEST_CG_SUB2 > +mkdir -p $CG_ROOT/$TEST_CG_SUB3 > +mkdir -p $CG_ROOT/$TEST_CG_SUB4 > + > +# Turn on misc and memory controller in non-leaf nodes > +echo "+misc" > $CG_ROOT/cgroup.subtree_control && \ > +echo "+memory" > $CG_ROOT/cgroup.subtree_control && \ > +echo "+misc" > $CG_ROOT/$TEST_ROOT_CG/cgroup.subtree_control && \ > +echo "+memory" > $CG_ROOT/$TEST_ROOT_CG/cgroup.subtree_control && \ > +echo "+misc" > $CG_ROOT/$TEST_CG_SUB1/cgroup.subtree_control > +if [ $? -ne 0 ]; then > + echo "# Failed setting up cgroups, make sure misc and memory cgroups are enabled." > + exit 1 > +fi > + > +CAPACITY=$(grep "sgx_epc" "$CG_ROOT/misc.capacity" | awk '{print $2}') > +# This is below number of VA pages needed for enclave of capacity size. So > +# should fail oversubscribed cases > +SMALL=$(( CAPACITY / 512 )) > + > +# At least load one enclave of capacity size successfully, maybe up to 4. > +# But some may fail if we run more than 4 concurrent enclaves of capacity size. > +LARGE=$(( SMALL * 4 )) > + > +# Load lots of enclaves > +LARGER=$CAPACITY > +echo "# Setting up limits." > +echo "sgx_epc $SMALL" > $CG_ROOT/$TEST_CG_SUB1/misc.max && \ > +echo "sgx_epc $LARGE" > $CG_ROOT/$TEST_CG_SUB2/misc.max && \ > +echo "sgx_epc $LARGER" > $CG_ROOT/$TEST_CG_SUB4/misc.max > +if [ $? -ne 0 ]; then > + echo "# Failed setting up misc limits." > + exit 1 > +fi > + > +clean_up() > +{ > + sleep 2 > + rmdir $CG_ROOT/$TEST_CG_SUB2 > + rmdir $CG_ROOT/$TEST_CG_SUB3 > + rmdir $CG_ROOT/$TEST_CG_SUB4 > + rmdir $CG_ROOT/$TEST_CG_SUB1 > + rmdir $CG_ROOT/$TEST_ROOT_CG > +} > + > +timestamp=$(date +%Y%m%d_%H%M%S) > + > +test_cmd="./test_sgx -t unclobbered_vdso_oversubscribed" > + > +PROCESS_SUCCESS=1 > +PROCESS_FAILURE=0 > + > +# Wait for a process and check for expected exit status. > +# > +# Arguments: > +# $1 - the pid of the process to wait and check. > +# $2 - 1 if expecting success, 0 for failure. > +# > +# Return: > +# 0 if the exit status of the process matches the expectation. > +# 1 otherwise. > +wait_check_process_status() { > + pid=$1 > + check_for_success=$2 > + > + wait "$pid" > + status=$? > + > + if [ $check_for_success -eq $PROCESS_SUCCESS ] && [ $status -eq 0 ]; then > + echo "# Process $pid succeeded." > + return 0 > + elif [ $check_for_success -eq $PROCESS_FAILURE ] && [ $status -ne 0 ]; then > + echo "# Process $pid returned failure." > + return 0 > + fi > + return 1 > +} > + > +# Wait for a set of processes and check for expected exit status > +# > +# Arguments: > +# $1 - 1 if expecting success, 0 for failure. > +# remaining args - The pids of the processes > +# > +# Return: > +# 0 if exit status of any process matches the expectation. > +# 1 otherwise. > +wait_and_detect_for_any() { > + check_for_success=$1 > + > + shift > + detected=1 # 0 for success detection > + > + for pid in $@; do > + if wait_check_process_status "$pid" "$check_for_success"; then > + detected=0 > + # Wait for other processes to exit > + fi > + done > + > + return $detected > +} > + > +echo "# Start unclobbered_vdso_oversubscribed with SMALL limit, expecting failure..." > +# Always use leaf node of misc cgroups > +# these may fail on OOM > +./ash_cgexec.sh $TEST_CG_SUB3 $test_cmd >cgtest_small_$timestamp.log 2>&1 > +if [ $? -eq 0 ]; then > + echo "# Fail on SMALL limit, not expecting any test passes." > + clean_up > + exit 1 > +else > + echo "# Test failed as expected." > +fi > + > +echo "# PASSED SMALL limit." > + > +echo "# Start 4 concurrent unclobbered_vdso_oversubscribed tests with LARGE limit, > + expecting at least one success...." > + > +pids="" > +for i in 1 2 3 4; do > + ( > + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd >cgtest_large_positive_$timestamp.$i.log 2>&1 > + ) & > + pids="$pids $!" > +done > + > + > +if wait_and_detect_for_any $PROCESS_SUCCESS "$pids"; then > + echo "# PASSED LARGE limit positive testing." > +else > + echo "# Failed on LARGE limit positive testing, no test passes." > + clean_up > + exit 1 > +fi > + > +echo "# Start 5 concurrent unclobbered_vdso_oversubscribed tests with LARGE limit, > + expecting at least one failure...." > +pids="" > +for i in 1 2 3 4 5; do > + ( > + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd >cgtest_large_negative_$timestamp.$i.log 2>&1 > + ) & > + pids="$pids $!" > +done > + > +if wait_and_detect_for_any $PROCESS_FAILURE "$pids"; then > + echo "# PASSED LARGE limit negative testing." > +else > + echo "# Failed on LARGE limit negative testing, no test fails." > + clean_up > + exit 1 > +fi > + > +echo "# Start 8 concurrent unclobbered_vdso_oversubscribed tests with LARGER limit, > + expecting no failure...." > +pids="" > +for i in 1 2 3 4 5 6 7 8; do > + ( > + ./ash_cgexec.sh $TEST_CG_SUB4 $test_cmd >cgtest_larger_$timestamp.$i.log 2>&1 > + ) & > + pids="$pids $!" > +done > + > +if wait_and_detect_for_any $PROCESS_FAILURE "$pids"; then > + echo "# Failed on LARGER limit, at least one test fails." > + clean_up > + exit 1 > +else > + echo "# PASSED LARGER limit tests." > +fi > + > +echo "# Start 8 concurrent unclobbered_vdso_oversubscribed tests with LARGER limit, > + randomly kill one, expecting no failure...." > +pids="" > +for i in 1 2 3 4 5 6 7 8; do > + ( > + ./ash_cgexec.sh $TEST_CG_SUB4 $test_cmd >cgtest_larger_kill_$timestamp.$i.log 2>&1 > + ) & > + pids="$pids $!" > +done > +random_number=$(awk 'BEGIN{srand();print int(rand()*5)}') > +sleep $((random_number + 1)) > + > +# Randomly select a process to kill > +# Make sure usage counter not leaked at the end. > +RANDOM_INDEX=$(awk 'BEGIN{srand();print int(rand()*8)}') > +counter=0 > +for pid in $pids; do > + if [ "$counter" -eq "$RANDOM_INDEX" ]; then > + PID_TO_KILL=$pid > + break > + fi > + counter=$((counter + 1)) > +done > + > +kill $PID_TO_KILL > +echo "# Killed process with PID: $PID_TO_KILL" > + > +any_failure=0 > +for pid in $pids; do > + wait "$pid" > + status=$? > + if [ "$pid" != "$PID_TO_KILL" ]; then > + if [ $status -ne 0 ]; then > + echo "# Process $pid returned failure." > + any_failure=1 > + fi > + fi > +done > + > +if [ $any_failure -ne 0 ]; then > + echo "# Failed on random killing, at least one test fails." > + clean_up > + exit 1 > +fi > +echo "# PASSED LARGER limit test with a process randomly killed." > + > +MEM_LIMIT_TOO_SMALL=$((CAPACITY - 2 * LARGE)) > + > +echo "$MEM_LIMIT_TOO_SMALL" > $CG_ROOT/$TEST_CG_SUB2/memory.max > +if [ $? -ne 0 ]; then > + echo "# Failed creating memory controller." > + clean_up > + exit 1 > +fi > + > +echo "# Start 4 concurrent unclobbered_vdso_oversubscribed tests with LARGE EPC limit, > + and too small RAM limit, expecting all failures...." > +# Ensure swapping off so the OOM killer is activated when mem_cgroup limit is hit. > +swapoff -a > +pids="" > +for i in 1 2 3 4; do > + ( > + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd >cgtest_large_oom_$timestamp.$i.log 2>&1 > + ) & > + pids="$pids $!" > +done > + > +if wait_and_detect_for_any $PROCESS_SUCCESS "$pids"; then > + echo "# Failed on tests with memcontrol, some tests did not fail." > + clean_up > + swapon -a > + exit 1 > +else > + swapon -a > + echo "# PASSED LARGE limit tests with memcontrol." > +fi > + > +sleep 2 > + > +USAGE=$(grep '^sgx_epc' "$CG_ROOT/$TEST_ROOT_CG/misc.current" | awk '{print $2}') > +if [ "$USAGE" -ne 0 ]; then > + echo "# Failed: Final usage is $USAGE, not 0." > +else > + echo "# PASSED leakage check." > + echo "# PASSED ALL cgroup limit tests, cleanup cgroups..." > +fi > +clean_up > +echo "# done." > diff --git a/tools/testing/selftests/sgx/watch_misc_for_tests.sh b/tools/testing/selftests/sgx/watch_misc_for_tests.sh > new file mode 100755 > index 000000000000..1c9985726ace > --- /dev/null > +++ b/tools/testing/selftests/sgx/watch_misc_for_tests.sh > @@ -0,0 +1,11 @@ > +#!/usr/bin/env sh > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright(c) 2023, 2024 Intel Corporation. > + > +if [ -z "$1" ]; then > + echo "No argument supplied, please provide 'max', 'current', or 'events'" > + exit 1 > +fi > + > +watch -n 1 'find /sys/fs/cgroup -wholename "*/test*/misc.'$1'" -exec \ > + sh -c '\''echo "$1:"; cat "$1"'\'' _ {} \;' I'll compile the kernel with this and see what happens! Have you tried to run the test suite from top-level? This is just a sanity check. I've few times forgot to do this so thus asking :-) BR, Jarkko
On Wed Apr 10, 2024 at 9:25 PM EEST, Haitao Huang wrote: > To run selftests for EPC cgroup: > > sudo ./run_epc_cg_selftests.sh > > To watch misc cgroup 'current' changes during testing, run this in a > separate terminal: > > ./watch_misc_for_tests.sh current > > With different cgroups, the script starts one or multiple concurrent SGX > selftests (test_sgx), each to run the unclobbered_vdso_oversubscribed > test case, which loads an enclave of EPC size equal to the EPC capacity > available on the platform. The script checks results against the > expectation set for each cgroup and reports success or failure. > > The script creates 3 different cgroups at the beginning with following > expectations: > > 1) SMALL - intentionally small enough to fail the test loading an > enclave of size equal to the capacity. > 2) LARGE - large enough to run up to 4 concurrent tests but fail some if > more than 4 concurrent tests are run. The script starts 4 expecting at > least one test to pass, and then starts 5 expecting at least one test > to fail. > 3) LARGER - limit is the same as the capacity, large enough to run lots of > concurrent tests. The script starts 8 of them and expects all pass. > Then it reruns the same test with one process randomly killed and > usage checked to be zero after all processes exit. > > The script also includes a test with low mem_cg limit and LARGE sgx_epc > limit to verify that the RAM used for per-cgroup reclamation is charged > to a proper mem_cg. For this test, it turns off swapping before start, > and turns swapping back on afterwards. > > Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> > --- > V11: > - Remove cgroups-tools dependency and make scripts ash compatible. (Jarkko) > - Drop support for cgroup v1 and simplify. (Michal, Jarkko) > - Add documentation for functions. (Jarkko) > - Turn off swapping before memcontrol tests and back on after > - Format and style fixes, name for hard coded values > > V7: > - Added memcontrol test. > > V5: > - Added script with automatic results checking, remove the interactive > script. > - The script can run independent from the series below. > --- > tools/testing/selftests/sgx/ash_cgexec.sh | 16 + > .../selftests/sgx/run_epc_cg_selftests.sh | 275 ++++++++++++++++++ > .../selftests/sgx/watch_misc_for_tests.sh | 11 + > 3 files changed, 302 insertions(+) > create mode 100755 tools/testing/selftests/sgx/ash_cgexec.sh > create mode 100755 tools/testing/selftests/sgx/run_epc_cg_selftests.sh > create mode 100755 tools/testing/selftests/sgx/watch_misc_for_tests.sh > > diff --git a/tools/testing/selftests/sgx/ash_cgexec.sh b/tools/testing/selftests/sgx/ash_cgexec.sh > new file mode 100755 > index 000000000000..cfa5d2b0e795 > --- /dev/null > +++ b/tools/testing/selftests/sgx/ash_cgexec.sh > @@ -0,0 +1,16 @@ > +#!/usr/bin/env sh > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright(c) 2024 Intel Corporation. > + > +# Start a program in a given cgroup. > +# Supports V2 cgroup paths, relative to /sys/fs/cgroup > +if [ "$#" -lt 2 ]; then > + echo "Usage: $0 <v2 cgroup path> <command> [args...]" > + exit 1 > +fi > +# Move this shell to the cgroup. > +echo 0 >/sys/fs/cgroup/$1/cgroup.procs > +shift > +# Execute the command within the cgroup > +exec "$@" > + > diff --git a/tools/testing/selftests/sgx/run_epc_cg_selftests.sh b/tools/testing/selftests/sgx/run_epc_cg_selftests.sh > new file mode 100755 > index 000000000000..dd56273056fc > --- /dev/null > +++ b/tools/testing/selftests/sgx/run_epc_cg_selftests.sh > @@ -0,0 +1,275 @@ > +#!/usr/bin/env sh > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright(c) 2023, 2024 Intel Corporation. > + > +TEST_ROOT_CG=selftest > +TEST_CG_SUB1=$TEST_ROOT_CG/test1 > +TEST_CG_SUB2=$TEST_ROOT_CG/test2 > +# We will only set limit in test1 and run tests in test3 > +TEST_CG_SUB3=$TEST_ROOT_CG/test1/test3 > +TEST_CG_SUB4=$TEST_ROOT_CG/test4 > + > +# Cgroup v2 only > +CG_ROOT=/sys/fs/cgroup > +mkdir -p $CG_ROOT/$TEST_CG_SUB1 > +mkdir -p $CG_ROOT/$TEST_CG_SUB2 > +mkdir -p $CG_ROOT/$TEST_CG_SUB3 > +mkdir -p $CG_ROOT/$TEST_CG_SUB4 > + > +# Turn on misc and memory controller in non-leaf nodes > +echo "+misc" > $CG_ROOT/cgroup.subtree_control && \ > +echo "+memory" > $CG_ROOT/cgroup.subtree_control && \ > +echo "+misc" > $CG_ROOT/$TEST_ROOT_CG/cgroup.subtree_control && \ > +echo "+memory" > $CG_ROOT/$TEST_ROOT_CG/cgroup.subtree_control && \ > +echo "+misc" > $CG_ROOT/$TEST_CG_SUB1/cgroup.subtree_control > +if [ $? -ne 0 ]; then > + echo "# Failed setting up cgroups, make sure misc and memory cgroups are enabled." > + exit 1 > +fi > + > +CAPACITY=$(grep "sgx_epc" "$CG_ROOT/misc.capacity" | awk '{print $2}') > +# This is below number of VA pages needed for enclave of capacity size. So > +# should fail oversubscribed cases > +SMALL=$(( CAPACITY / 512 )) > + > +# At least load one enclave of capacity size successfully, maybe up to 4. > +# But some may fail if we run more than 4 concurrent enclaves of capacity size. > +LARGE=$(( SMALL * 4 )) > + > +# Load lots of enclaves > +LARGER=$CAPACITY > +echo "# Setting up limits." > +echo "sgx_epc $SMALL" > $CG_ROOT/$TEST_CG_SUB1/misc.max && \ > +echo "sgx_epc $LARGE" > $CG_ROOT/$TEST_CG_SUB2/misc.max && \ > +echo "sgx_epc $LARGER" > $CG_ROOT/$TEST_CG_SUB4/misc.max > +if [ $? -ne 0 ]; then > + echo "# Failed setting up misc limits." > + exit 1 > +fi > + > +clean_up() > +{ > + sleep 2 > + rmdir $CG_ROOT/$TEST_CG_SUB2 > + rmdir $CG_ROOT/$TEST_CG_SUB3 > + rmdir $CG_ROOT/$TEST_CG_SUB4 > + rmdir $CG_ROOT/$TEST_CG_SUB1 > + rmdir $CG_ROOT/$TEST_ROOT_CG > +} > + > +timestamp=$(date +%Y%m%d_%H%M%S) > + > +test_cmd="./test_sgx -t unclobbered_vdso_oversubscribed" > + > +PROCESS_SUCCESS=1 > +PROCESS_FAILURE=0 > + > +# Wait for a process and check for expected exit status. > +# > +# Arguments: > +# $1 - the pid of the process to wait and check. > +# $2 - 1 if expecting success, 0 for failure. > +# > +# Return: > +# 0 if the exit status of the process matches the expectation. > +# 1 otherwise. > +wait_check_process_status() { > + pid=$1 > + check_for_success=$2 > + > + wait "$pid" > + status=$? > + > + if [ $check_for_success -eq $PROCESS_SUCCESS ] && [ $status -eq 0 ]; then > + echo "# Process $pid succeeded." > + return 0 > + elif [ $check_for_success -eq $PROCESS_FAILURE ] && [ $status -ne 0 ]; then > + echo "# Process $pid returned failure." > + return 0 > + fi > + return 1 > +} > + > +# Wait for a set of processes and check for expected exit status > +# > +# Arguments: > +# $1 - 1 if expecting success, 0 for failure. > +# remaining args - The pids of the processes > +# > +# Return: > +# 0 if exit status of any process matches the expectation. > +# 1 otherwise. > +wait_and_detect_for_any() { > + check_for_success=$1 > + > + shift > + detected=1 # 0 for success detection > + > + for pid in $@; do > + if wait_check_process_status "$pid" "$check_for_success"; then > + detected=0 > + # Wait for other processes to exit > + fi > + done > + > + return $detected > +} > + > +echo "# Start unclobbered_vdso_oversubscribed with SMALL limit, expecting failure..." > +# Always use leaf node of misc cgroups > +# these may fail on OOM > +./ash_cgexec.sh $TEST_CG_SUB3 $test_cmd >cgtest_small_$timestamp.log 2>&1 > +if [ $? -eq 0 ]; then > + echo "# Fail on SMALL limit, not expecting any test passes." > + clean_up > + exit 1 > +else > + echo "# Test failed as expected." > +fi > + > +echo "# PASSED SMALL limit." > + > +echo "# Start 4 concurrent unclobbered_vdso_oversubscribed tests with LARGE limit, > + expecting at least one success...." > + > +pids="" > +for i in 1 2 3 4; do > + ( > + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd >cgtest_large_positive_$timestamp.$i.log 2>&1 > + ) & > + pids="$pids $!" > +done > + > + > +if wait_and_detect_for_any $PROCESS_SUCCESS "$pids"; then > + echo "# PASSED LARGE limit positive testing." > +else > + echo "# Failed on LARGE limit positive testing, no test passes." > + clean_up > + exit 1 > +fi > + > +echo "# Start 5 concurrent unclobbered_vdso_oversubscribed tests with LARGE limit, > + expecting at least one failure...." > +pids="" > +for i in 1 2 3 4 5; do > + ( > + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd >cgtest_large_negative_$timestamp.$i.log 2>&1 > + ) & > + pids="$pids $!" > +done > + > +if wait_and_detect_for_any $PROCESS_FAILURE "$pids"; then > + echo "# PASSED LARGE limit negative testing." > +else > + echo "# Failed on LARGE limit negative testing, no test fails." > + clean_up > + exit 1 > +fi > + > +echo "# Start 8 concurrent unclobbered_vdso_oversubscribed tests with LARGER limit, > + expecting no failure...." > +pids="" > +for i in 1 2 3 4 5 6 7 8; do > + ( > + ./ash_cgexec.sh $TEST_CG_SUB4 $test_cmd >cgtest_larger_$timestamp.$i.log 2>&1 > + ) & > + pids="$pids $!" > +done > + > +if wait_and_detect_for_any $PROCESS_FAILURE "$pids"; then > + echo "# Failed on LARGER limit, at least one test fails." > + clean_up > + exit 1 > +else > + echo "# PASSED LARGER limit tests." > +fi > + > +echo "# Start 8 concurrent unclobbered_vdso_oversubscribed tests with LARGER limit, > + randomly kill one, expecting no failure...." > +pids="" > +for i in 1 2 3 4 5 6 7 8; do > + ( > + ./ash_cgexec.sh $TEST_CG_SUB4 $test_cmd >cgtest_larger_kill_$timestamp.$i.log 2>&1 > + ) & > + pids="$pids $!" > +done > +random_number=$(awk 'BEGIN{srand();print int(rand()*5)}') > +sleep $((random_number + 1)) > + > +# Randomly select a process to kill > +# Make sure usage counter not leaked at the end. > +RANDOM_INDEX=$(awk 'BEGIN{srand();print int(rand()*8)}') > +counter=0 > +for pid in $pids; do > + if [ "$counter" -eq "$RANDOM_INDEX" ]; then > + PID_TO_KILL=$pid > + break > + fi > + counter=$((counter + 1)) > +done > + > +kill $PID_TO_KILL > +echo "# Killed process with PID: $PID_TO_KILL" > + > +any_failure=0 > +for pid in $pids; do > + wait "$pid" > + status=$? > + if [ "$pid" != "$PID_TO_KILL" ]; then > + if [ $status -ne 0 ]; then > + echo "# Process $pid returned failure." > + any_failure=1 > + fi > + fi > +done > + > +if [ $any_failure -ne 0 ]; then > + echo "# Failed on random killing, at least one test fails." > + clean_up > + exit 1 > +fi > +echo "# PASSED LARGER limit test with a process randomly killed." > + > +MEM_LIMIT_TOO_SMALL=$((CAPACITY - 2 * LARGE)) > + > +echo "$MEM_LIMIT_TOO_SMALL" > $CG_ROOT/$TEST_CG_SUB2/memory.max > +if [ $? -ne 0 ]; then > + echo "# Failed creating memory controller." > + clean_up > + exit 1 > +fi > + > +echo "# Start 4 concurrent unclobbered_vdso_oversubscribed tests with LARGE EPC limit, > + and too small RAM limit, expecting all failures...." > +# Ensure swapping off so the OOM killer is activated when mem_cgroup limit is hit. > +swapoff -a > +pids="" > +for i in 1 2 3 4; do > + ( > + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd >cgtest_large_oom_$timestamp.$i.log 2>&1 > + ) & > + pids="$pids $!" > +done > + > +if wait_and_detect_for_any $PROCESS_SUCCESS "$pids"; then > + echo "# Failed on tests with memcontrol, some tests did not fail." > + clean_up > + swapon -a > + exit 1 > +else > + swapon -a > + echo "# PASSED LARGE limit tests with memcontrol." > +fi > + > +sleep 2 > + > +USAGE=$(grep '^sgx_epc' "$CG_ROOT/$TEST_ROOT_CG/misc.current" | awk '{print $2}') > +if [ "$USAGE" -ne 0 ]; then > + echo "# Failed: Final usage is $USAGE, not 0." > +else > + echo "# PASSED leakage check." > + echo "# PASSED ALL cgroup limit tests, cleanup cgroups..." > +fi > +clean_up > +echo "# done." > diff --git a/tools/testing/selftests/sgx/watch_misc_for_tests.sh b/tools/testing/selftests/sgx/watch_misc_for_tests.sh > new file mode 100755 > index 000000000000..1c9985726ace > --- /dev/null > +++ b/tools/testing/selftests/sgx/watch_misc_for_tests.sh > @@ -0,0 +1,11 @@ > +#!/usr/bin/env sh > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright(c) 2023, 2024 Intel Corporation. > + > +if [ -z "$1" ]; then > + echo "No argument supplied, please provide 'max', 'current', or 'events'" > + exit 1 > +fi > + > +watch -n 1 'find /sys/fs/cgroup -wholename "*/test*/misc.'$1'" -exec \ > + sh -c '\''echo "$1:"; cat "$1"'\'' _ {} \;' So this is what happens now: jarkko@mustatorvisieni:~/linux-tpmdd> make -C tools/testing/selftests/sgx run_tests make: Entering directory '/home/jarkko/linux-tpmdd/tools/testing/selftests/sgx' TAP version 13 1..1 # timeout set to 45 # selftests: sgx: test_sgx # TAP version 13 # 1..16 # # Starting 16 tests from 1 test cases. # # RUN enclave.unclobbered_vdso ... # # OK enclave.unclobbered_vdso # ok 1 enclave.unclobbered_vdso # # RUN enclave.unclobbered_vdso_oversubscribed ... # # OK enclave.unclobbered_vdso_oversubscribed # ok 2 enclave.unclobbered_vdso_oversubscribed # # RUN enclave.unclobbered_vdso_oversubscribed_remove ... # # main.c:402:unclobbered_vdso_oversubscribed_remove:Creating an enclave with 98566144 bytes heap may take a while ... # # main.c:457:unclobbered_vdso_oversubscribed_remove:Changing type of 98566144 bytes to trimmed may take a while ... # # main.c:473:unclobbered_vdso_oversubscribed_remove:Entering enclave to run EACCEPT for each page of 98566144 bytes may take a while ... # # main.c:494:unclobbered_vdso_oversubscribed_remove:Removing 98566144 bytes from enclave may take a while ... # # OK enclave.unclobbered_vdso_oversubscribed_remove # ok 3 enclave.unclobbered_vdso_oversubscribed_remove # # RUN enclave.clobbered_vdso ... # # OK enclave.clobbered_vdso # ok 4 enclave.clobbered_vdso # # RUN enclave.clobbered_vdso_and_user_function ... # # OK enclave.clobbered_vdso_and_user_function # ok 5 enclave.clobbered_vdso_and_user_function # # RUN enclave.tcs_entry ... # # OK enclave.tcs_entry # ok 6 enclave.tcs_entry # # RUN enclave.pte_permissions ... # # OK enclave.pte_permissions # ok 7 enclave.pte_permissions # # RUN enclave.tcs_permissions ... # # OK enclave.tcs_permissions # ok 8 enclave.tcs_permissions # # RUN enclave.epcm_permissions ... # # OK enclave.epcm_permissions # ok 9 enclave.epcm_permissions # # RUN enclave.augment ... # # OK enclave.augment # ok 10 enclave.augment # # RUN enclave.augment_via_eaccept ... # # OK enclave.augment_via_eaccept # ok 11 enclave.augment_via_eaccept # # RUN enclave.tcs_create ... # # OK enclave.tcs_create # ok 12 enclave.tcs_create # # RUN enclave.remove_added_page_no_eaccept ... # # OK enclave.remove_added_page_no_eaccept # ok 13 enclave.remove_added_page_no_eaccept # # RUN enclave.remove_added_page_invalid_access ... # # OK enclave.remove_added_page_invalid_access # ok 14 enclave.remove_added_page_invalid_access # # RUN enclave.remove_added_page_invalid_access_after_eaccept ... # # OK enclave.remove_added_page_invalid_access_after_eaccept # ok 15 enclave.remove_added_page_invalid_access_after_eaccept # # RUN enclave.remove_untouched_page ... # # OK enclave.remove_untouched_page # ok 16 enclave.remove_untouched_page # # PASSED: 16 / 16 tests passed. # # Totals: pass:16 fail:0 xfail:0 xpass:0 skip:0 error:0 ok 1 selftests: sgx: test_sgx Also cgroups tests are expected to run when invoking "run_tests". I also wonder do we really want this: diff --git a/config/x86_64/default b/config/x86_64/default index 246b1f3df46..6c3d20af7b4 100644 --- a/config/x86_64/default +++ b/config/x86_64/default @@ -502,6 +502,7 @@ CONFIG_X86_INTEL_TSX_MODE_OFF=y # CONFIG_X86_INTEL_TSX_MODE_ON is not set # CONFIG_X86_INTEL_TSX_MODE_AUTO is not set CONFIG_X86_SGX=y +CONFIG_CGROUP_SGX_EPC=y # CONFIG_X86_USER_SHADOW_STACK is not set CONFIG_EFI=y CONFIG_EFI_STUB=y It is a small change but affects all of downstream, not just OpenSUSE. I hard time projecting a situation where you wanted SGX but without cgroups support so perhaps it would be a better idea to enable cgroups unconditionally when SGX and cgroups are part of the kernel config? Then downstream can just pick the patches and call it a day... BR, Jarkko
On Sun, 14 Apr 2024 10:01:03 -0500, Jarkko Sakkinen <jarkko@kernel.org> wrote: > On Wed Apr 10, 2024 at 9:25 PM EEST, Haitao Huang wrote: >> To run selftests for EPC cgroup: >> >> sudo ./run_epc_cg_selftests.sh >> >> To watch misc cgroup 'current' changes during testing, run this in a >> separate terminal: >> >> ./watch_misc_for_tests.sh current >> >> With different cgroups, the script starts one or multiple concurrent SGX >> selftests (test_sgx), each to run the unclobbered_vdso_oversubscribed >> test case, which loads an enclave of EPC size equal to the EPC capacity >> available on the platform. The script checks results against the >> expectation set for each cgroup and reports success or failure. >> >> The script creates 3 different cgroups at the beginning with following >> expectations: >> >> 1) SMALL - intentionally small enough to fail the test loading an >> enclave of size equal to the capacity. >> 2) LARGE - large enough to run up to 4 concurrent tests but fail some if >> more than 4 concurrent tests are run. The script starts 4 expecting at >> least one test to pass, and then starts 5 expecting at least one test >> to fail. >> 3) LARGER - limit is the same as the capacity, large enough to run lots >> of >> concurrent tests. The script starts 8 of them and expects all pass. >> Then it reruns the same test with one process randomly killed and >> usage checked to be zero after all processes exit. >> >> The script also includes a test with low mem_cg limit and LARGE sgx_epc >> limit to verify that the RAM used for per-cgroup reclamation is charged >> to a proper mem_cg. For this test, it turns off swapping before start, >> and turns swapping back on afterwards. >> >> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> >> --- >> V11: >> - Remove cgroups-tools dependency and make scripts ash compatible. >> (Jarkko) >> - Drop support for cgroup v1 and simplify. (Michal, Jarkko) >> - Add documentation for functions. (Jarkko) >> - Turn off swapping before memcontrol tests and back on after >> - Format and style fixes, name for hard coded values >> >> V7: >> - Added memcontrol test. >> >> V5: >> - Added script with automatic results checking, remove the interactive >> script. >> - The script can run independent from the series below. >> --- >> tools/testing/selftests/sgx/ash_cgexec.sh | 16 + >> .../selftests/sgx/run_epc_cg_selftests.sh | 275 ++++++++++++++++++ >> .../selftests/sgx/watch_misc_for_tests.sh | 11 + >> 3 files changed, 302 insertions(+) >> create mode 100755 tools/testing/selftests/sgx/ash_cgexec.sh >> create mode 100755 tools/testing/selftests/sgx/run_epc_cg_selftests.sh >> create mode 100755 tools/testing/selftests/sgx/watch_misc_for_tests.sh >> >> diff --git a/tools/testing/selftests/sgx/ash_cgexec.sh >> b/tools/testing/selftests/sgx/ash_cgexec.sh >> new file mode 100755 >> index 000000000000..cfa5d2b0e795 >> --- /dev/null >> +++ b/tools/testing/selftests/sgx/ash_cgexec.sh >> @@ -0,0 +1,16 @@ >> +#!/usr/bin/env sh >> +# SPDX-License-Identifier: GPL-2.0 >> +# Copyright(c) 2024 Intel Corporation. >> + >> +# Start a program in a given cgroup. >> +# Supports V2 cgroup paths, relative to /sys/fs/cgroup >> +if [ "$#" -lt 2 ]; then >> + echo "Usage: $0 <v2 cgroup path> <command> [args...]" >> + exit 1 >> +fi >> +# Move this shell to the cgroup. >> +echo 0 >/sys/fs/cgroup/$1/cgroup.procs >> +shift >> +# Execute the command within the cgroup >> +exec "$@" >> + >> diff --git a/tools/testing/selftests/sgx/run_epc_cg_selftests.sh >> b/tools/testing/selftests/sgx/run_epc_cg_selftests.sh >> new file mode 100755 >> index 000000000000..dd56273056fc >> --- /dev/null >> +++ b/tools/testing/selftests/sgx/run_epc_cg_selftests.sh >> @@ -0,0 +1,275 @@ >> +#!/usr/bin/env sh >> +# SPDX-License-Identifier: GPL-2.0 >> +# Copyright(c) 2023, 2024 Intel Corporation. >> + >> +TEST_ROOT_CG=selftest >> +TEST_CG_SUB1=$TEST_ROOT_CG/test1 >> +TEST_CG_SUB2=$TEST_ROOT_CG/test2 >> +# We will only set limit in test1 and run tests in test3 >> +TEST_CG_SUB3=$TEST_ROOT_CG/test1/test3 >> +TEST_CG_SUB4=$TEST_ROOT_CG/test4 >> + >> +# Cgroup v2 only >> +CG_ROOT=/sys/fs/cgroup >> +mkdir -p $CG_ROOT/$TEST_CG_SUB1 >> +mkdir -p $CG_ROOT/$TEST_CG_SUB2 >> +mkdir -p $CG_ROOT/$TEST_CG_SUB3 >> +mkdir -p $CG_ROOT/$TEST_CG_SUB4 >> + >> +# Turn on misc and memory controller in non-leaf nodes >> +echo "+misc" > $CG_ROOT/cgroup.subtree_control && \ >> +echo "+memory" > $CG_ROOT/cgroup.subtree_control && \ >> +echo "+misc" > $CG_ROOT/$TEST_ROOT_CG/cgroup.subtree_control && \ >> +echo "+memory" > $CG_ROOT/$TEST_ROOT_CG/cgroup.subtree_control && \ >> +echo "+misc" > $CG_ROOT/$TEST_CG_SUB1/cgroup.subtree_control >> +if [ $? -ne 0 ]; then >> + echo "# Failed setting up cgroups, make sure misc and memory >> cgroups are enabled." >> + exit 1 >> +fi >> + >> +CAPACITY=$(grep "sgx_epc" "$CG_ROOT/misc.capacity" | awk '{print $2}') >> +# This is below number of VA pages needed for enclave of capacity >> size. So >> +# should fail oversubscribed cases >> +SMALL=$(( CAPACITY / 512 )) >> + >> +# At least load one enclave of capacity size successfully, maybe up to >> 4. >> +# But some may fail if we run more than 4 concurrent enclaves of >> capacity size. >> +LARGE=$(( SMALL * 4 )) >> + >> +# Load lots of enclaves >> +LARGER=$CAPACITY >> +echo "# Setting up limits." >> +echo "sgx_epc $SMALL" > $CG_ROOT/$TEST_CG_SUB1/misc.max && \ >> +echo "sgx_epc $LARGE" > $CG_ROOT/$TEST_CG_SUB2/misc.max && \ >> +echo "sgx_epc $LARGER" > $CG_ROOT/$TEST_CG_SUB4/misc.max >> +if [ $? -ne 0 ]; then >> + echo "# Failed setting up misc limits." >> + exit 1 >> +fi >> + >> +clean_up() >> +{ >> + sleep 2 >> + rmdir $CG_ROOT/$TEST_CG_SUB2 >> + rmdir $CG_ROOT/$TEST_CG_SUB3 >> + rmdir $CG_ROOT/$TEST_CG_SUB4 >> + rmdir $CG_ROOT/$TEST_CG_SUB1 >> + rmdir $CG_ROOT/$TEST_ROOT_CG >> +} >> + >> +timestamp=$(date +%Y%m%d_%H%M%S) >> + >> +test_cmd="./test_sgx -t unclobbered_vdso_oversubscribed" >> + >> +PROCESS_SUCCESS=1 >> +PROCESS_FAILURE=0 >> + >> +# Wait for a process and check for expected exit status. >> +# >> +# Arguments: >> +# $1 - the pid of the process to wait and check. >> +# $2 - 1 if expecting success, 0 for failure. >> +# >> +# Return: >> +# 0 if the exit status of the process matches the expectation. >> +# 1 otherwise. >> +wait_check_process_status() { >> + pid=$1 >> + check_for_success=$2 >> + >> + wait "$pid" >> + status=$? >> + >> + if [ $check_for_success -eq $PROCESS_SUCCESS ] && [ $status -eq 0 >> ]; then >> + echo "# Process $pid succeeded." >> + return 0 >> + elif [ $check_for_success -eq $PROCESS_FAILURE ] && [ $status -ne >> 0 ]; then >> + echo "# Process $pid returned failure." >> + return 0 >> + fi >> + return 1 >> +} >> + >> +# Wait for a set of processes and check for expected exit status >> +# >> +# Arguments: >> +# $1 - 1 if expecting success, 0 for failure. >> +# remaining args - The pids of the processes >> +# >> +# Return: >> +# 0 if exit status of any process matches the expectation. >> +# 1 otherwise. >> +wait_and_detect_for_any() { >> + check_for_success=$1 >> + >> + shift >> + detected=1 # 0 for success detection >> + >> + for pid in $@; do >> + if wait_check_process_status "$pid" "$check_for_success"; then >> + detected=0 >> + # Wait for other processes to exit >> + fi >> + done >> + >> + return $detected >> +} >> + >> +echo "# Start unclobbered_vdso_oversubscribed with SMALL limit, >> expecting failure..." >> +# Always use leaf node of misc cgroups >> +# these may fail on OOM >> +./ash_cgexec.sh $TEST_CG_SUB3 $test_cmd >cgtest_small_$timestamp.log >> 2>&1 >> +if [ $? -eq 0 ]; then >> + echo "# Fail on SMALL limit, not expecting any test passes." >> + clean_up >> + exit 1 >> +else >> + echo "# Test failed as expected." >> +fi >> + >> +echo "# PASSED SMALL limit." >> + >> +echo "# Start 4 concurrent unclobbered_vdso_oversubscribed tests with >> LARGE limit, >> + expecting at least one success...." >> + >> +pids="" >> +for i in 1 2 3 4; do >> + ( >> + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd >> >cgtest_large_positive_$timestamp.$i.log 2>&1 >> + ) & >> + pids="$pids $!" >> +done >> + >> + >> +if wait_and_detect_for_any $PROCESS_SUCCESS "$pids"; then >> + echo "# PASSED LARGE limit positive testing." >> +else >> + echo "# Failed on LARGE limit positive testing, no test passes." >> + clean_up >> + exit 1 >> +fi >> + >> +echo "# Start 5 concurrent unclobbered_vdso_oversubscribed tests with >> LARGE limit, >> + expecting at least one failure...." >> +pids="" >> +for i in 1 2 3 4 5; do >> + ( >> + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd >> >cgtest_large_negative_$timestamp.$i.log 2>&1 >> + ) & >> + pids="$pids $!" >> +done >> + >> +if wait_and_detect_for_any $PROCESS_FAILURE "$pids"; then >> + echo "# PASSED LARGE limit negative testing." >> +else >> + echo "# Failed on LARGE limit negative testing, no test fails." >> + clean_up >> + exit 1 >> +fi >> + >> +echo "# Start 8 concurrent unclobbered_vdso_oversubscribed tests with >> LARGER limit, >> + expecting no failure...." >> +pids="" >> +for i in 1 2 3 4 5 6 7 8; do >> + ( >> + ./ash_cgexec.sh $TEST_CG_SUB4 $test_cmd >> >cgtest_larger_$timestamp.$i.log 2>&1 >> + ) & >> + pids="$pids $!" >> +done >> + >> +if wait_and_detect_for_any $PROCESS_FAILURE "$pids"; then >> + echo "# Failed on LARGER limit, at least one test fails." >> + clean_up >> + exit 1 >> +else >> + echo "# PASSED LARGER limit tests." >> +fi >> + >> +echo "# Start 8 concurrent unclobbered_vdso_oversubscribed tests with >> LARGER limit, >> + randomly kill one, expecting no failure...." >> +pids="" >> +for i in 1 2 3 4 5 6 7 8; do >> + ( >> + ./ash_cgexec.sh $TEST_CG_SUB4 $test_cmd >> >cgtest_larger_kill_$timestamp.$i.log 2>&1 >> + ) & >> + pids="$pids $!" >> +done >> +random_number=$(awk 'BEGIN{srand();print int(rand()*5)}') >> +sleep $((random_number + 1)) >> + >> +# Randomly select a process to kill >> +# Make sure usage counter not leaked at the end. >> +RANDOM_INDEX=$(awk 'BEGIN{srand();print int(rand()*8)}') >> +counter=0 >> +for pid in $pids; do >> + if [ "$counter" -eq "$RANDOM_INDEX" ]; then >> + PID_TO_KILL=$pid >> + break >> + fi >> + counter=$((counter + 1)) >> +done >> + >> +kill $PID_TO_KILL >> +echo "# Killed process with PID: $PID_TO_KILL" >> + >> +any_failure=0 >> +for pid in $pids; do >> + wait "$pid" >> + status=$? >> + if [ "$pid" != "$PID_TO_KILL" ]; then >> + if [ $status -ne 0 ]; then >> + echo "# Process $pid returned failure." >> + any_failure=1 >> + fi >> + fi >> +done >> + >> +if [ $any_failure -ne 0 ]; then >> + echo "# Failed on random killing, at least one test fails." >> + clean_up >> + exit 1 >> +fi >> +echo "# PASSED LARGER limit test with a process randomly killed." >> + >> +MEM_LIMIT_TOO_SMALL=$((CAPACITY - 2 * LARGE)) >> + >> +echo "$MEM_LIMIT_TOO_SMALL" > $CG_ROOT/$TEST_CG_SUB2/memory.max >> +if [ $? -ne 0 ]; then >> + echo "# Failed creating memory controller." >> + clean_up >> + exit 1 >> +fi >> + >> +echo "# Start 4 concurrent unclobbered_vdso_oversubscribed tests with >> LARGE EPC limit, >> + and too small RAM limit, expecting all failures...." >> +# Ensure swapping off so the OOM killer is activated when mem_cgroup >> limit is hit. >> +swapoff -a >> +pids="" >> +for i in 1 2 3 4; do >> + ( >> + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd >> >cgtest_large_oom_$timestamp.$i.log 2>&1 >> + ) & >> + pids="$pids $!" >> +done >> + >> +if wait_and_detect_for_any $PROCESS_SUCCESS "$pids"; then >> + echo "# Failed on tests with memcontrol, some tests did not fail." >> + clean_up >> + swapon -a >> + exit 1 >> +else >> + swapon -a >> + echo "# PASSED LARGE limit tests with memcontrol." >> +fi >> + >> +sleep 2 >> + >> +USAGE=$(grep '^sgx_epc' "$CG_ROOT/$TEST_ROOT_CG/misc.current" | awk >> '{print $2}') >> +if [ "$USAGE" -ne 0 ]; then >> + echo "# Failed: Final usage is $USAGE, not 0." >> +else >> + echo "# PASSED leakage check." >> + echo "# PASSED ALL cgroup limit tests, cleanup cgroups..." >> +fi >> +clean_up >> +echo "# done." >> diff --git a/tools/testing/selftests/sgx/watch_misc_for_tests.sh >> b/tools/testing/selftests/sgx/watch_misc_for_tests.sh >> new file mode 100755 >> index 000000000000..1c9985726ace >> --- /dev/null >> +++ b/tools/testing/selftests/sgx/watch_misc_for_tests.sh >> @@ -0,0 +1,11 @@ >> +#!/usr/bin/env sh >> +# SPDX-License-Identifier: GPL-2.0 >> +# Copyright(c) 2023, 2024 Intel Corporation. >> + >> +if [ -z "$1" ]; then >> + echo "No argument supplied, please provide 'max', 'current', or >> 'events'" >> + exit 1 >> +fi >> + >> +watch -n 1 'find /sys/fs/cgroup -wholename "*/test*/misc.'$1'" -exec \ >> + sh -c '\''echo "$1:"; cat "$1"'\'' _ {} \;' > > So this is what happens now: > > jarkko@mustatorvisieni:~/linux-tpmdd> make -C > tools/testing/selftests/sgx run_tests > make: Entering directory > '/home/jarkko/linux-tpmdd/tools/testing/selftests/sgx' > TAP version 13 > 1..1 > # timeout set to 45 > # selftests: sgx: test_sgx > # TAP version 13 > # 1..16 > # # Starting 16 tests from 1 test cases. > # # RUN enclave.unclobbered_vdso ... > # # OK enclave.unclobbered_vdso > # ok 1 enclave.unclobbered_vdso > # # RUN enclave.unclobbered_vdso_oversubscribed ... > # # OK enclave.unclobbered_vdso_oversubscribed > # ok 2 enclave.unclobbered_vdso_oversubscribed > # # RUN enclave.unclobbered_vdso_oversubscribed_remove ... > # # main.c:402:unclobbered_vdso_oversubscribed_remove:Creating an > enclave with 98566144 bytes heap may take a while ... > # # main.c:457:unclobbered_vdso_oversubscribed_remove:Changing type of > 98566144 bytes to trimmed may take a while ... > # # main.c:473:unclobbered_vdso_oversubscribed_remove:Entering enclave > to run EACCEPT for each page of 98566144 bytes may take a while ... > # # main.c:494:unclobbered_vdso_oversubscribed_remove:Removing 98566144 > bytes from enclave may take a while ... > # # OK enclave.unclobbered_vdso_oversubscribed_remove > # ok 3 enclave.unclobbered_vdso_oversubscribed_remove > # # RUN enclave.clobbered_vdso ... > # # OK enclave.clobbered_vdso > # ok 4 enclave.clobbered_vdso > # # RUN enclave.clobbered_vdso_and_user_function ... > # # OK enclave.clobbered_vdso_and_user_function > # ok 5 enclave.clobbered_vdso_and_user_function > # # RUN enclave.tcs_entry ... > # # OK enclave.tcs_entry > # ok 6 enclave.tcs_entry > # # RUN enclave.pte_permissions ... > # # OK enclave.pte_permissions > # ok 7 enclave.pte_permissions > # # RUN enclave.tcs_permissions ... > # # OK enclave.tcs_permissions > # ok 8 enclave.tcs_permissions > # # RUN enclave.epcm_permissions ... > # # OK enclave.epcm_permissions > # ok 9 enclave.epcm_permissions > # # RUN enclave.augment ... > # # OK enclave.augment > # ok 10 enclave.augment > # # RUN enclave.augment_via_eaccept ... > # # OK enclave.augment_via_eaccept > # ok 11 enclave.augment_via_eaccept > # # RUN enclave.tcs_create ... > # # OK enclave.tcs_create > # ok 12 enclave.tcs_create > # # RUN enclave.remove_added_page_no_eaccept ... > # # OK enclave.remove_added_page_no_eaccept > # ok 13 enclave.remove_added_page_no_eaccept > # # RUN enclave.remove_added_page_invalid_access ... > # # OK enclave.remove_added_page_invalid_access > # ok 14 enclave.remove_added_page_invalid_access > # # RUN > enclave.remove_added_page_invalid_access_after_eaccept ... > # # OK enclave.remove_added_page_invalid_access_after_eaccept > # ok 15 enclave.remove_added_page_invalid_access_after_eaccept > # # RUN enclave.remove_untouched_page ... > # # OK enclave.remove_untouched_page > # ok 16 enclave.remove_untouched_page > # # PASSED: 16 / 16 tests passed. > # # Totals: pass:16 fail:0 xfail:0 xpass:0 skip:0 error:0 > ok 1 selftests: sgx: test_sgx > > Also cgroups tests are expected to run when invoking "run_tests". > I can add the SGX cgroup tests to the "run_tests" target. But it will need more than the 45secs given by the default timeout especially on platforms with larger EPC. You will need run with --override-timeout (current SGX selftests also timeout for default 45 sec on a server with 4G EPC). > I also wonder do we really want this: > > diff --git a/config/x86_64/default b/config/x86_64/default > index 246b1f3df46..6c3d20af7b4 100644 > --- a/config/x86_64/default > +++ b/config/x86_64/default > @@ -502,6 +502,7 @@ CONFIG_X86_INTEL_TSX_MODE_OFF=y > # CONFIG_X86_INTEL_TSX_MODE_ON is not set > # CONFIG_X86_INTEL_TSX_MODE_AUTO is not set > CONFIG_X86_SGX=y > +CONFIG_CGROUP_SGX_EPC=y > # CONFIG_X86_USER_SHADOW_STACK is not set > CONFIG_EFI=y > CONFIG_EFI_STUB=y > > It is a small change but affects all of downstream, not just OpenSUSE. > > I hard time projecting a situation where you wanted SGX but without > cgroups support so perhaps it would be a better idea to enable cgroups > unconditionally when SGX and cgroups are part of the kernel config? > > Then downstream can just pick the patches and call it a day... > > BR, Jarkko > I don't have issue to remove this config and conditionally compile in SGX cgroup implementation when MISC is configured. Thanks Haitao
On Sat, 13 Apr 2024 16:34:17 -0500, Jarkko Sakkinen <jarkko@kernel.org> wrote: > On Wed Apr 10, 2024 at 9:25 PM EEST, Haitao Huang wrote: >> To run selftests for EPC cgroup: >> >> sudo ./run_epc_cg_selftests.sh >> >> To watch misc cgroup 'current' changes during testing, run this in a >> separate terminal: >> >> ./watch_misc_for_tests.sh current >> >> With different cgroups, the script starts one or multiple concurrent SGX >> selftests (test_sgx), each to run the unclobbered_vdso_oversubscribed >> test case, which loads an enclave of EPC size equal to the EPC capacity >> available on the platform. The script checks results against the >> expectation set for each cgroup and reports success or failure. >> >> The script creates 3 different cgroups at the beginning with following >> expectations: >> >> 1) SMALL - intentionally small enough to fail the test loading an >> enclave of size equal to the capacity. >> 2) LARGE - large enough to run up to 4 concurrent tests but fail some if >> more than 4 concurrent tests are run. The script starts 4 expecting at >> least one test to pass, and then starts 5 expecting at least one test >> to fail. >> 3) LARGER - limit is the same as the capacity, large enough to run lots >> of >> concurrent tests. The script starts 8 of them and expects all pass. >> Then it reruns the same test with one process randomly killed and >> usage checked to be zero after all processes exit. >> >> The script also includes a test with low mem_cg limit and LARGE sgx_epc >> limit to verify that the RAM used for per-cgroup reclamation is charged >> to a proper mem_cg. For this test, it turns off swapping before start, >> and turns swapping back on afterwards. >> >> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> >> --- >> V11: >> - Remove cgroups-tools dependency and make scripts ash compatible. >> (Jarkko) >> - Drop support for cgroup v1 and simplify. (Michal, Jarkko) >> - Add documentation for functions. (Jarkko) >> - Turn off swapping before memcontrol tests and back on after >> - Format and style fixes, name for hard coded values >> >> V7: >> - Added memcontrol test. >> >> V5: >> - Added script with automatic results checking, remove the interactive >> script. >> - The script can run independent from the series below. >> --- >> tools/testing/selftests/sgx/ash_cgexec.sh | 16 + >> .../selftests/sgx/run_epc_cg_selftests.sh | 275 ++++++++++++++++++ >> .../selftests/sgx/watch_misc_for_tests.sh | 11 + >> 3 files changed, 302 insertions(+) >> create mode 100755 tools/testing/selftests/sgx/ash_cgexec.sh >> create mode 100755 tools/testing/selftests/sgx/run_epc_cg_selftests.sh >> create mode 100755 tools/testing/selftests/sgx/watch_misc_for_tests.sh >> >> diff --git a/tools/testing/selftests/sgx/ash_cgexec.sh >> b/tools/testing/selftests/sgx/ash_cgexec.sh >> new file mode 100755 >> index 000000000000..cfa5d2b0e795 >> --- /dev/null >> +++ b/tools/testing/selftests/sgx/ash_cgexec.sh >> @@ -0,0 +1,16 @@ >> +#!/usr/bin/env sh >> +# SPDX-License-Identifier: GPL-2.0 >> +# Copyright(c) 2024 Intel Corporation. >> + >> +# Start a program in a given cgroup. >> +# Supports V2 cgroup paths, relative to /sys/fs/cgroup >> +if [ "$#" -lt 2 ]; then >> + echo "Usage: $0 <v2 cgroup path> <command> [args...]" >> + exit 1 >> +fi >> +# Move this shell to the cgroup. >> +echo 0 >/sys/fs/cgroup/$1/cgroup.procs >> +shift >> +# Execute the command within the cgroup >> +exec "$@" >> + >> diff --git a/tools/testing/selftests/sgx/run_epc_cg_selftests.sh >> b/tools/testing/selftests/sgx/run_epc_cg_selftests.sh >> new file mode 100755 >> index 000000000000..dd56273056fc >> --- /dev/null >> +++ b/tools/testing/selftests/sgx/run_epc_cg_selftests.sh >> @@ -0,0 +1,275 @@ >> +#!/usr/bin/env sh >> +# SPDX-License-Identifier: GPL-2.0 >> +# Copyright(c) 2023, 2024 Intel Corporation. >> + >> +TEST_ROOT_CG=selftest >> +TEST_CG_SUB1=$TEST_ROOT_CG/test1 >> +TEST_CG_SUB2=$TEST_ROOT_CG/test2 >> +# We will only set limit in test1 and run tests in test3 >> +TEST_CG_SUB3=$TEST_ROOT_CG/test1/test3 >> +TEST_CG_SUB4=$TEST_ROOT_CG/test4 >> + >> +# Cgroup v2 only >> +CG_ROOT=/sys/fs/cgroup >> +mkdir -p $CG_ROOT/$TEST_CG_SUB1 >> +mkdir -p $CG_ROOT/$TEST_CG_SUB2 >> +mkdir -p $CG_ROOT/$TEST_CG_SUB3 >> +mkdir -p $CG_ROOT/$TEST_CG_SUB4 >> + >> +# Turn on misc and memory controller in non-leaf nodes >> +echo "+misc" > $CG_ROOT/cgroup.subtree_control && \ >> +echo "+memory" > $CG_ROOT/cgroup.subtree_control && \ >> +echo "+misc" > $CG_ROOT/$TEST_ROOT_CG/cgroup.subtree_control && \ >> +echo "+memory" > $CG_ROOT/$TEST_ROOT_CG/cgroup.subtree_control && \ >> +echo "+misc" > $CG_ROOT/$TEST_CG_SUB1/cgroup.subtree_control >> +if [ $? -ne 0 ]; then >> + echo "# Failed setting up cgroups, make sure misc and memory >> cgroups are enabled." >> + exit 1 >> +fi >> + >> +CAPACITY=$(grep "sgx_epc" "$CG_ROOT/misc.capacity" | awk '{print $2}') >> +# This is below number of VA pages needed for enclave of capacity >> size. So >> +# should fail oversubscribed cases >> +SMALL=$(( CAPACITY / 512 )) >> + >> +# At least load one enclave of capacity size successfully, maybe up to >> 4. >> +# But some may fail if we run more than 4 concurrent enclaves of >> capacity size. >> +LARGE=$(( SMALL * 4 )) >> + >> +# Load lots of enclaves >> +LARGER=$CAPACITY >> +echo "# Setting up limits." >> +echo "sgx_epc $SMALL" > $CG_ROOT/$TEST_CG_SUB1/misc.max && \ >> +echo "sgx_epc $LARGE" > $CG_ROOT/$TEST_CG_SUB2/misc.max && \ >> +echo "sgx_epc $LARGER" > $CG_ROOT/$TEST_CG_SUB4/misc.max >> +if [ $? -ne 0 ]; then >> + echo "# Failed setting up misc limits." >> + exit 1 >> +fi >> + >> +clean_up() >> +{ >> + sleep 2 >> + rmdir $CG_ROOT/$TEST_CG_SUB2 >> + rmdir $CG_ROOT/$TEST_CG_SUB3 >> + rmdir $CG_ROOT/$TEST_CG_SUB4 >> + rmdir $CG_ROOT/$TEST_CG_SUB1 >> + rmdir $CG_ROOT/$TEST_ROOT_CG >> +} >> + >> +timestamp=$(date +%Y%m%d_%H%M%S) >> + >> +test_cmd="./test_sgx -t unclobbered_vdso_oversubscribed" >> + >> +PROCESS_SUCCESS=1 >> +PROCESS_FAILURE=0 >> + >> +# Wait for a process and check for expected exit status. >> +# >> +# Arguments: >> +# $1 - the pid of the process to wait and check. >> +# $2 - 1 if expecting success, 0 for failure. >> +# >> +# Return: >> +# 0 if the exit status of the process matches the expectation. >> +# 1 otherwise. >> +wait_check_process_status() { >> + pid=$1 >> + check_for_success=$2 >> + >> + wait "$pid" >> + status=$? >> + >> + if [ $check_for_success -eq $PROCESS_SUCCESS ] && [ $status -eq 0 >> ]; then >> + echo "# Process $pid succeeded." >> + return 0 >> + elif [ $check_for_success -eq $PROCESS_FAILURE ] && [ $status -ne >> 0 ]; then >> + echo "# Process $pid returned failure." >> + return 0 >> + fi >> + return 1 >> +} >> + >> +# Wait for a set of processes and check for expected exit status >> +# >> +# Arguments: >> +# $1 - 1 if expecting success, 0 for failure. >> +# remaining args - The pids of the processes >> +# >> +# Return: >> +# 0 if exit status of any process matches the expectation. >> +# 1 otherwise. >> +wait_and_detect_for_any() { >> + check_for_success=$1 >> + >> + shift >> + detected=1 # 0 for success detection >> + >> + for pid in $@; do >> + if wait_check_process_status "$pid" "$check_for_success"; then >> + detected=0 >> + # Wait for other processes to exit >> + fi >> + done >> + >> + return $detected >> +} >> + >> +echo "# Start unclobbered_vdso_oversubscribed with SMALL limit, >> expecting failure..." >> +# Always use leaf node of misc cgroups >> +# these may fail on OOM >> +./ash_cgexec.sh $TEST_CG_SUB3 $test_cmd >cgtest_small_$timestamp.log >> 2>&1 >> +if [ $? -eq 0 ]; then >> + echo "# Fail on SMALL limit, not expecting any test passes." >> + clean_up >> + exit 1 >> +else >> + echo "# Test failed as expected." >> +fi >> + >> +echo "# PASSED SMALL limit." >> + >> +echo "# Start 4 concurrent unclobbered_vdso_oversubscribed tests with >> LARGE limit, >> + expecting at least one success...." >> + >> +pids="" >> +for i in 1 2 3 4; do >> + ( >> + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd >> >cgtest_large_positive_$timestamp.$i.log 2>&1 >> + ) & >> + pids="$pids $!" >> +done >> + >> + >> +if wait_and_detect_for_any $PROCESS_SUCCESS "$pids"; then >> + echo "# PASSED LARGE limit positive testing." >> +else >> + echo "# Failed on LARGE limit positive testing, no test passes." >> + clean_up >> + exit 1 >> +fi >> + >> +echo "# Start 5 concurrent unclobbered_vdso_oversubscribed tests with >> LARGE limit, >> + expecting at least one failure...." >> +pids="" >> +for i in 1 2 3 4 5; do >> + ( >> + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd >> >cgtest_large_negative_$timestamp.$i.log 2>&1 >> + ) & >> + pids="$pids $!" >> +done >> + >> +if wait_and_detect_for_any $PROCESS_FAILURE "$pids"; then >> + echo "# PASSED LARGE limit negative testing." >> +else >> + echo "# Failed on LARGE limit negative testing, no test fails." >> + clean_up >> + exit 1 >> +fi >> + >> +echo "# Start 8 concurrent unclobbered_vdso_oversubscribed tests with >> LARGER limit, >> + expecting no failure...." >> +pids="" >> +for i in 1 2 3 4 5 6 7 8; do >> + ( >> + ./ash_cgexec.sh $TEST_CG_SUB4 $test_cmd >> >cgtest_larger_$timestamp.$i.log 2>&1 >> + ) & >> + pids="$pids $!" >> +done >> + >> +if wait_and_detect_for_any $PROCESS_FAILURE "$pids"; then >> + echo "# Failed on LARGER limit, at least one test fails." >> + clean_up >> + exit 1 >> +else >> + echo "# PASSED LARGER limit tests." >> +fi >> + >> +echo "# Start 8 concurrent unclobbered_vdso_oversubscribed tests with >> LARGER limit, >> + randomly kill one, expecting no failure...." >> +pids="" >> +for i in 1 2 3 4 5 6 7 8; do >> + ( >> + ./ash_cgexec.sh $TEST_CG_SUB4 $test_cmd >> >cgtest_larger_kill_$timestamp.$i.log 2>&1 >> + ) & >> + pids="$pids $!" >> +done >> +random_number=$(awk 'BEGIN{srand();print int(rand()*5)}') >> +sleep $((random_number + 1)) >> + >> +# Randomly select a process to kill >> +# Make sure usage counter not leaked at the end. >> +RANDOM_INDEX=$(awk 'BEGIN{srand();print int(rand()*8)}') >> +counter=0 >> +for pid in $pids; do >> + if [ "$counter" -eq "$RANDOM_INDEX" ]; then >> + PID_TO_KILL=$pid >> + break >> + fi >> + counter=$((counter + 1)) >> +done >> + >> +kill $PID_TO_KILL >> +echo "# Killed process with PID: $PID_TO_KILL" >> + >> +any_failure=0 >> +for pid in $pids; do >> + wait "$pid" >> + status=$? >> + if [ "$pid" != "$PID_TO_KILL" ]; then >> + if [ $status -ne 0 ]; then >> + echo "# Process $pid returned failure." >> + any_failure=1 >> + fi >> + fi >> +done >> + >> +if [ $any_failure -ne 0 ]; then >> + echo "# Failed on random killing, at least one test fails." >> + clean_up >> + exit 1 >> +fi >> +echo "# PASSED LARGER limit test with a process randomly killed." >> + >> +MEM_LIMIT_TOO_SMALL=$((CAPACITY - 2 * LARGE)) >> + >> +echo "$MEM_LIMIT_TOO_SMALL" > $CG_ROOT/$TEST_CG_SUB2/memory.max >> +if [ $? -ne 0 ]; then >> + echo "# Failed creating memory controller." >> + clean_up >> + exit 1 >> +fi >> + >> +echo "# Start 4 concurrent unclobbered_vdso_oversubscribed tests with >> LARGE EPC limit, >> + and too small RAM limit, expecting all failures...." >> +# Ensure swapping off so the OOM killer is activated when mem_cgroup >> limit is hit. >> +swapoff -a >> +pids="" >> +for i in 1 2 3 4; do >> + ( >> + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd >> >cgtest_large_oom_$timestamp.$i.log 2>&1 >> + ) & >> + pids="$pids $!" >> +done >> + >> +if wait_and_detect_for_any $PROCESS_SUCCESS "$pids"; then >> + echo "# Failed on tests with memcontrol, some tests did not fail." >> + clean_up >> + swapon -a >> + exit 1 >> +else >> + swapon -a >> + echo "# PASSED LARGE limit tests with memcontrol." >> +fi >> + >> +sleep 2 >> + >> +USAGE=$(grep '^sgx_epc' "$CG_ROOT/$TEST_ROOT_CG/misc.current" | awk >> '{print $2}') >> +if [ "$USAGE" -ne 0 ]; then >> + echo "# Failed: Final usage is $USAGE, not 0." >> +else >> + echo "# PASSED leakage check." >> + echo "# PASSED ALL cgroup limit tests, cleanup cgroups..." >> +fi >> +clean_up >> +echo "# done." >> diff --git a/tools/testing/selftests/sgx/watch_misc_for_tests.sh >> b/tools/testing/selftests/sgx/watch_misc_for_tests.sh >> new file mode 100755 >> index 000000000000..1c9985726ace >> --- /dev/null >> +++ b/tools/testing/selftests/sgx/watch_misc_for_tests.sh >> @@ -0,0 +1,11 @@ >> +#!/usr/bin/env sh >> +# SPDX-License-Identifier: GPL-2.0 >> +# Copyright(c) 2023, 2024 Intel Corporation. >> + >> +if [ -z "$1" ]; then >> + echo "No argument supplied, please provide 'max', 'current', or >> 'events'" >> + exit 1 >> +fi >> + >> +watch -n 1 'find /sys/fs/cgroup -wholename "*/test*/misc.'$1'" -exec \ >> + sh -c '\''echo "$1:"; cat "$1"'\'' _ {} \;' > > I'll compile the kernel with this and see what happens! > > Have you tried to run the test suite from top-level? This is just a > sanity check. I've few times forgot to do this so thus asking :-) > > BR, Jarkko > I added following on https://github.com/haitaohuang/linux/tree/sgx_cg_upstream_v11_plus Please update to run from top-level. --- a/tools/testing/selftests/sgx/Makefile +++ b/tools/testing/selftests/sgx/Makefile @@ -20,7 +20,8 @@ ENCL_LDFLAGS := -Wl,-T,test_encl.lds,--build-id=none ifeq ($(CAN_BUILD_X86_64), 1) TEST_CUSTOM_PROGS := $(OUTPUT)/test_sgx -TEST_FILES := $(OUTPUT)/test_encl.elf +TEST_FILES := $(OUTPUT)/test_encl.elf ash_cgexec.sh +TEST_PROGS := run_epc_cg_selftests.sh all: $(TEST_CUSTOM_PROGS) $(OUTPUT)/test_encl.elf endif ... index dd56273056fc..ba0451fc16bc 100755 --- a/tools/testing/selftests/sgx/run_epc_cg_selftests.sh +++ b/tools/testing/selftests/sgx/run_epc_cg_selftests.sh @@ -2,6 +2,14 @@ # SPDX-License-Identifier: GPL-2.0 # Copyright(c) 2023, 2024 Intel Corporation. + +# Kselftest framework requirement - SKIP code is 4. +ksft_skip=4 +if [ "$(id -u)" -ne 0 ]; then + echo "SKIP: SGX Cgroup tests need root priviledges." + exit $ksft_skip +fi + TEST_ROOT_CG=selftest Thanks Haitao
On Mon Apr 15, 2024 at 6:13 AM EEST, Haitao Huang wrote: > On Sun, 14 Apr 2024 10:01:03 -0500, Jarkko Sakkinen <jarkko@kernel.org> > wrote: > > > On Wed Apr 10, 2024 at 9:25 PM EEST, Haitao Huang wrote: > >> To run selftests for EPC cgroup: > >> > >> sudo ./run_epc_cg_selftests.sh > >> > >> To watch misc cgroup 'current' changes during testing, run this in a > >> separate terminal: > >> > >> ./watch_misc_for_tests.sh current > >> > >> With different cgroups, the script starts one or multiple concurrent SGX > >> selftests (test_sgx), each to run the unclobbered_vdso_oversubscribed > >> test case, which loads an enclave of EPC size equal to the EPC capacity > >> available on the platform. The script checks results against the > >> expectation set for each cgroup and reports success or failure. > >> > >> The script creates 3 different cgroups at the beginning with following > >> expectations: > >> > >> 1) SMALL - intentionally small enough to fail the test loading an > >> enclave of size equal to the capacity. > >> 2) LARGE - large enough to run up to 4 concurrent tests but fail some if > >> more than 4 concurrent tests are run. The script starts 4 expecting at > >> least one test to pass, and then starts 5 expecting at least one test > >> to fail. > >> 3) LARGER - limit is the same as the capacity, large enough to run lots > >> of > >> concurrent tests. The script starts 8 of them and expects all pass. > >> Then it reruns the same test with one process randomly killed and > >> usage checked to be zero after all processes exit. > >> > >> The script also includes a test with low mem_cg limit and LARGE sgx_epc > >> limit to verify that the RAM used for per-cgroup reclamation is charged > >> to a proper mem_cg. For this test, it turns off swapping before start, > >> and turns swapping back on afterwards. > >> > >> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> > >> --- > >> V11: > >> - Remove cgroups-tools dependency and make scripts ash compatible. > >> (Jarkko) > >> - Drop support for cgroup v1 and simplify. (Michal, Jarkko) > >> - Add documentation for functions. (Jarkko) > >> - Turn off swapping before memcontrol tests and back on after > >> - Format and style fixes, name for hard coded values > >> > >> V7: > >> - Added memcontrol test. > >> > >> V5: > >> - Added script with automatic results checking, remove the interactive > >> script. > >> - The script can run independent from the series below. > >> --- > >> tools/testing/selftests/sgx/ash_cgexec.sh | 16 + > >> .../selftests/sgx/run_epc_cg_selftests.sh | 275 ++++++++++++++++++ > >> .../selftests/sgx/watch_misc_for_tests.sh | 11 + > >> 3 files changed, 302 insertions(+) > >> create mode 100755 tools/testing/selftests/sgx/ash_cgexec.sh > >> create mode 100755 tools/testing/selftests/sgx/run_epc_cg_selftests.sh > >> create mode 100755 tools/testing/selftests/sgx/watch_misc_for_tests.sh > >> > >> diff --git a/tools/testing/selftests/sgx/ash_cgexec.sh > >> b/tools/testing/selftests/sgx/ash_cgexec.sh > >> new file mode 100755 > >> index 000000000000..cfa5d2b0e795 > >> --- /dev/null > >> +++ b/tools/testing/selftests/sgx/ash_cgexec.sh > >> @@ -0,0 +1,16 @@ > >> +#!/usr/bin/env sh > >> +# SPDX-License-Identifier: GPL-2.0 > >> +# Copyright(c) 2024 Intel Corporation. > >> + > >> +# Start a program in a given cgroup. > >> +# Supports V2 cgroup paths, relative to /sys/fs/cgroup > >> +if [ "$#" -lt 2 ]; then > >> + echo "Usage: $0 <v2 cgroup path> <command> [args...]" > >> + exit 1 > >> +fi > >> +# Move this shell to the cgroup. > >> +echo 0 >/sys/fs/cgroup/$1/cgroup.procs > >> +shift > >> +# Execute the command within the cgroup > >> +exec "$@" > >> + > >> diff --git a/tools/testing/selftests/sgx/run_epc_cg_selftests.sh > >> b/tools/testing/selftests/sgx/run_epc_cg_selftests.sh > >> new file mode 100755 > >> index 000000000000..dd56273056fc > >> --- /dev/null > >> +++ b/tools/testing/selftests/sgx/run_epc_cg_selftests.sh > >> @@ -0,0 +1,275 @@ > >> +#!/usr/bin/env sh > >> +# SPDX-License-Identifier: GPL-2.0 > >> +# Copyright(c) 2023, 2024 Intel Corporation. > >> + > >> +TEST_ROOT_CG=selftest > >> +TEST_CG_SUB1=$TEST_ROOT_CG/test1 > >> +TEST_CG_SUB2=$TEST_ROOT_CG/test2 > >> +# We will only set limit in test1 and run tests in test3 > >> +TEST_CG_SUB3=$TEST_ROOT_CG/test1/test3 > >> +TEST_CG_SUB4=$TEST_ROOT_CG/test4 > >> + > >> +# Cgroup v2 only > >> +CG_ROOT=/sys/fs/cgroup > >> +mkdir -p $CG_ROOT/$TEST_CG_SUB1 > >> +mkdir -p $CG_ROOT/$TEST_CG_SUB2 > >> +mkdir -p $CG_ROOT/$TEST_CG_SUB3 > >> +mkdir -p $CG_ROOT/$TEST_CG_SUB4 > >> + > >> +# Turn on misc and memory controller in non-leaf nodes > >> +echo "+misc" > $CG_ROOT/cgroup.subtree_control && \ > >> +echo "+memory" > $CG_ROOT/cgroup.subtree_control && \ > >> +echo "+misc" > $CG_ROOT/$TEST_ROOT_CG/cgroup.subtree_control && \ > >> +echo "+memory" > $CG_ROOT/$TEST_ROOT_CG/cgroup.subtree_control && \ > >> +echo "+misc" > $CG_ROOT/$TEST_CG_SUB1/cgroup.subtree_control > >> +if [ $? -ne 0 ]; then > >> + echo "# Failed setting up cgroups, make sure misc and memory > >> cgroups are enabled." > >> + exit 1 > >> +fi > >> + > >> +CAPACITY=$(grep "sgx_epc" "$CG_ROOT/misc.capacity" | awk '{print $2}') > >> +# This is below number of VA pages needed for enclave of capacity > >> size. So > >> +# should fail oversubscribed cases > >> +SMALL=$(( CAPACITY / 512 )) > >> + > >> +# At least load one enclave of capacity size successfully, maybe up to > >> 4. > >> +# But some may fail if we run more than 4 concurrent enclaves of > >> capacity size. > >> +LARGE=$(( SMALL * 4 )) > >> + > >> +# Load lots of enclaves > >> +LARGER=$CAPACITY > >> +echo "# Setting up limits." > >> +echo "sgx_epc $SMALL" > $CG_ROOT/$TEST_CG_SUB1/misc.max && \ > >> +echo "sgx_epc $LARGE" > $CG_ROOT/$TEST_CG_SUB2/misc.max && \ > >> +echo "sgx_epc $LARGER" > $CG_ROOT/$TEST_CG_SUB4/misc.max > >> +if [ $? -ne 0 ]; then > >> + echo "# Failed setting up misc limits." > >> + exit 1 > >> +fi > >> + > >> +clean_up() > >> +{ > >> + sleep 2 > >> + rmdir $CG_ROOT/$TEST_CG_SUB2 > >> + rmdir $CG_ROOT/$TEST_CG_SUB3 > >> + rmdir $CG_ROOT/$TEST_CG_SUB4 > >> + rmdir $CG_ROOT/$TEST_CG_SUB1 > >> + rmdir $CG_ROOT/$TEST_ROOT_CG > >> +} > >> + > >> +timestamp=$(date +%Y%m%d_%H%M%S) > >> + > >> +test_cmd="./test_sgx -t unclobbered_vdso_oversubscribed" > >> + > >> +PROCESS_SUCCESS=1 > >> +PROCESS_FAILURE=0 > >> + > >> +# Wait for a process and check for expected exit status. > >> +# > >> +# Arguments: > >> +# $1 - the pid of the process to wait and check. > >> +# $2 - 1 if expecting success, 0 for failure. > >> +# > >> +# Return: > >> +# 0 if the exit status of the process matches the expectation. > >> +# 1 otherwise. > >> +wait_check_process_status() { > >> + pid=$1 > >> + check_for_success=$2 > >> + > >> + wait "$pid" > >> + status=$? > >> + > >> + if [ $check_for_success -eq $PROCESS_SUCCESS ] && [ $status -eq 0 > >> ]; then > >> + echo "# Process $pid succeeded." > >> + return 0 > >> + elif [ $check_for_success -eq $PROCESS_FAILURE ] && [ $status -ne > >> 0 ]; then > >> + echo "# Process $pid returned failure." > >> + return 0 > >> + fi > >> + return 1 > >> +} > >> + > >> +# Wait for a set of processes and check for expected exit status > >> +# > >> +# Arguments: > >> +# $1 - 1 if expecting success, 0 for failure. > >> +# remaining args - The pids of the processes > >> +# > >> +# Return: > >> +# 0 if exit status of any process matches the expectation. > >> +# 1 otherwise. > >> +wait_and_detect_for_any() { > >> + check_for_success=$1 > >> + > >> + shift > >> + detected=1 # 0 for success detection > >> + > >> + for pid in $@; do > >> + if wait_check_process_status "$pid" "$check_for_success"; then > >> + detected=0 > >> + # Wait for other processes to exit > >> + fi > >> + done > >> + > >> + return $detected > >> +} > >> + > >> +echo "# Start unclobbered_vdso_oversubscribed with SMALL limit, > >> expecting failure..." > >> +# Always use leaf node of misc cgroups > >> +# these may fail on OOM > >> +./ash_cgexec.sh $TEST_CG_SUB3 $test_cmd >cgtest_small_$timestamp.log > >> 2>&1 > >> +if [ $? -eq 0 ]; then > >> + echo "# Fail on SMALL limit, not expecting any test passes." > >> + clean_up > >> + exit 1 > >> +else > >> + echo "# Test failed as expected." > >> +fi > >> + > >> +echo "# PASSED SMALL limit." > >> + > >> +echo "# Start 4 concurrent unclobbered_vdso_oversubscribed tests with > >> LARGE limit, > >> + expecting at least one success...." > >> + > >> +pids="" > >> +for i in 1 2 3 4; do > >> + ( > >> + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd > >> >cgtest_large_positive_$timestamp.$i.log 2>&1 > >> + ) & > >> + pids="$pids $!" > >> +done > >> + > >> + > >> +if wait_and_detect_for_any $PROCESS_SUCCESS "$pids"; then > >> + echo "# PASSED LARGE limit positive testing." > >> +else > >> + echo "# Failed on LARGE limit positive testing, no test passes." > >> + clean_up > >> + exit 1 > >> +fi > >> + > >> +echo "# Start 5 concurrent unclobbered_vdso_oversubscribed tests with > >> LARGE limit, > >> + expecting at least one failure...." > >> +pids="" > >> +for i in 1 2 3 4 5; do > >> + ( > >> + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd > >> >cgtest_large_negative_$timestamp.$i.log 2>&1 > >> + ) & > >> + pids="$pids $!" > >> +done > >> + > >> +if wait_and_detect_for_any $PROCESS_FAILURE "$pids"; then > >> + echo "# PASSED LARGE limit negative testing." > >> +else > >> + echo "# Failed on LARGE limit negative testing, no test fails." > >> + clean_up > >> + exit 1 > >> +fi > >> + > >> +echo "# Start 8 concurrent unclobbered_vdso_oversubscribed tests with > >> LARGER limit, > >> + expecting no failure...." > >> +pids="" > >> +for i in 1 2 3 4 5 6 7 8; do > >> + ( > >> + ./ash_cgexec.sh $TEST_CG_SUB4 $test_cmd > >> >cgtest_larger_$timestamp.$i.log 2>&1 > >> + ) & > >> + pids="$pids $!" > >> +done > >> + > >> +if wait_and_detect_for_any $PROCESS_FAILURE "$pids"; then > >> + echo "# Failed on LARGER limit, at least one test fails." > >> + clean_up > >> + exit 1 > >> +else > >> + echo "# PASSED LARGER limit tests." > >> +fi > >> + > >> +echo "# Start 8 concurrent unclobbered_vdso_oversubscribed tests with > >> LARGER limit, > >> + randomly kill one, expecting no failure...." > >> +pids="" > >> +for i in 1 2 3 4 5 6 7 8; do > >> + ( > >> + ./ash_cgexec.sh $TEST_CG_SUB4 $test_cmd > >> >cgtest_larger_kill_$timestamp.$i.log 2>&1 > >> + ) & > >> + pids="$pids $!" > >> +done > >> +random_number=$(awk 'BEGIN{srand();print int(rand()*5)}') > >> +sleep $((random_number + 1)) > >> + > >> +# Randomly select a process to kill > >> +# Make sure usage counter not leaked at the end. > >> +RANDOM_INDEX=$(awk 'BEGIN{srand();print int(rand()*8)}') > >> +counter=0 > >> +for pid in $pids; do > >> + if [ "$counter" -eq "$RANDOM_INDEX" ]; then > >> + PID_TO_KILL=$pid > >> + break > >> + fi > >> + counter=$((counter + 1)) > >> +done > >> + > >> +kill $PID_TO_KILL > >> +echo "# Killed process with PID: $PID_TO_KILL" > >> + > >> +any_failure=0 > >> +for pid in $pids; do > >> + wait "$pid" > >> + status=$? > >> + if [ "$pid" != "$PID_TO_KILL" ]; then > >> + if [ $status -ne 0 ]; then > >> + echo "# Process $pid returned failure." > >> + any_failure=1 > >> + fi > >> + fi > >> +done > >> + > >> +if [ $any_failure -ne 0 ]; then > >> + echo "# Failed on random killing, at least one test fails." > >> + clean_up > >> + exit 1 > >> +fi > >> +echo "# PASSED LARGER limit test with a process randomly killed." > >> + > >> +MEM_LIMIT_TOO_SMALL=$((CAPACITY - 2 * LARGE)) > >> + > >> +echo "$MEM_LIMIT_TOO_SMALL" > $CG_ROOT/$TEST_CG_SUB2/memory.max > >> +if [ $? -ne 0 ]; then > >> + echo "# Failed creating memory controller." > >> + clean_up > >> + exit 1 > >> +fi > >> + > >> +echo "# Start 4 concurrent unclobbered_vdso_oversubscribed tests with > >> LARGE EPC limit, > >> + and too small RAM limit, expecting all failures...." > >> +# Ensure swapping off so the OOM killer is activated when mem_cgroup > >> limit is hit. > >> +swapoff -a > >> +pids="" > >> +for i in 1 2 3 4; do > >> + ( > >> + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd > >> >cgtest_large_oom_$timestamp.$i.log 2>&1 > >> + ) & > >> + pids="$pids $!" > >> +done > >> + > >> +if wait_and_detect_for_any $PROCESS_SUCCESS "$pids"; then > >> + echo "# Failed on tests with memcontrol, some tests did not fail." > >> + clean_up > >> + swapon -a > >> + exit 1 > >> +else > >> + swapon -a > >> + echo "# PASSED LARGE limit tests with memcontrol." > >> +fi > >> + > >> +sleep 2 > >> + > >> +USAGE=$(grep '^sgx_epc' "$CG_ROOT/$TEST_ROOT_CG/misc.current" | awk > >> '{print $2}') > >> +if [ "$USAGE" -ne 0 ]; then > >> + echo "# Failed: Final usage is $USAGE, not 0." > >> +else > >> + echo "# PASSED leakage check." > >> + echo "# PASSED ALL cgroup limit tests, cleanup cgroups..." > >> +fi > >> +clean_up > >> +echo "# done." > >> diff --git a/tools/testing/selftests/sgx/watch_misc_for_tests.sh > >> b/tools/testing/selftests/sgx/watch_misc_for_tests.sh > >> new file mode 100755 > >> index 000000000000..1c9985726ace > >> --- /dev/null > >> +++ b/tools/testing/selftests/sgx/watch_misc_for_tests.sh > >> @@ -0,0 +1,11 @@ > >> +#!/usr/bin/env sh > >> +# SPDX-License-Identifier: GPL-2.0 > >> +# Copyright(c) 2023, 2024 Intel Corporation. > >> + > >> +if [ -z "$1" ]; then > >> + echo "No argument supplied, please provide 'max', 'current', or > >> 'events'" > >> + exit 1 > >> +fi > >> + > >> +watch -n 1 'find /sys/fs/cgroup -wholename "*/test*/misc.'$1'" -exec \ > >> + sh -c '\''echo "$1:"; cat "$1"'\'' _ {} \;' > > > > So this is what happens now: > > > > jarkko@mustatorvisieni:~/linux-tpmdd> make -C > > tools/testing/selftests/sgx run_tests > > make: Entering directory > > '/home/jarkko/linux-tpmdd/tools/testing/selftests/sgx' > > TAP version 13 > > 1..1 > > # timeout set to 45 > > # selftests: sgx: test_sgx > > # TAP version 13 > > # 1..16 > > # # Starting 16 tests from 1 test cases. > > # # RUN enclave.unclobbered_vdso ... > > # # OK enclave.unclobbered_vdso > > # ok 1 enclave.unclobbered_vdso > > # # RUN enclave.unclobbered_vdso_oversubscribed ... > > # # OK enclave.unclobbered_vdso_oversubscribed > > # ok 2 enclave.unclobbered_vdso_oversubscribed > > # # RUN enclave.unclobbered_vdso_oversubscribed_remove ... > > # # main.c:402:unclobbered_vdso_oversubscribed_remove:Creating an > > enclave with 98566144 bytes heap may take a while ... > > # # main.c:457:unclobbered_vdso_oversubscribed_remove:Changing type of > > 98566144 bytes to trimmed may take a while ... > > # # main.c:473:unclobbered_vdso_oversubscribed_remove:Entering enclave > > to run EACCEPT for each page of 98566144 bytes may take a while ... > > # # main.c:494:unclobbered_vdso_oversubscribed_remove:Removing 98566144 > > bytes from enclave may take a while ... > > # # OK enclave.unclobbered_vdso_oversubscribed_remove > > # ok 3 enclave.unclobbered_vdso_oversubscribed_remove > > # # RUN enclave.clobbered_vdso ... > > # # OK enclave.clobbered_vdso > > # ok 4 enclave.clobbered_vdso > > # # RUN enclave.clobbered_vdso_and_user_function ... > > # # OK enclave.clobbered_vdso_and_user_function > > # ok 5 enclave.clobbered_vdso_and_user_function > > # # RUN enclave.tcs_entry ... > > # # OK enclave.tcs_entry > > # ok 6 enclave.tcs_entry > > # # RUN enclave.pte_permissions ... > > # # OK enclave.pte_permissions > > # ok 7 enclave.pte_permissions > > # # RUN enclave.tcs_permissions ... > > # # OK enclave.tcs_permissions > > # ok 8 enclave.tcs_permissions > > # # RUN enclave.epcm_permissions ... > > # # OK enclave.epcm_permissions > > # ok 9 enclave.epcm_permissions > > # # RUN enclave.augment ... > > # # OK enclave.augment > > # ok 10 enclave.augment > > # # RUN enclave.augment_via_eaccept ... > > # # OK enclave.augment_via_eaccept > > # ok 11 enclave.augment_via_eaccept > > # # RUN enclave.tcs_create ... > > # # OK enclave.tcs_create > > # ok 12 enclave.tcs_create > > # # RUN enclave.remove_added_page_no_eaccept ... > > # # OK enclave.remove_added_page_no_eaccept > > # ok 13 enclave.remove_added_page_no_eaccept > > # # RUN enclave.remove_added_page_invalid_access ... > > # # OK enclave.remove_added_page_invalid_access > > # ok 14 enclave.remove_added_page_invalid_access > > # # RUN > > enclave.remove_added_page_invalid_access_after_eaccept ... > > # # OK enclave.remove_added_page_invalid_access_after_eaccept > > # ok 15 enclave.remove_added_page_invalid_access_after_eaccept > > # # RUN enclave.remove_untouched_page ... > > # # OK enclave.remove_untouched_page > > # ok 16 enclave.remove_untouched_page > > # # PASSED: 16 / 16 tests passed. > > # # Totals: pass:16 fail:0 xfail:0 xpass:0 skip:0 error:0 > > ok 1 selftests: sgx: test_sgx > > > > Also cgroups tests are expected to run when invoking "run_tests". > > > I can add the SGX cgroup tests to the "run_tests" target. But it will need > more than the 45secs given by the default timeout especially on platforms > with larger EPC. You will need run with --override-timeout > (current SGX selftests also timeout for default 45 sec on a server with 4G > EPC). > > > > I also wonder do we really want this: > > > > diff --git a/config/x86_64/default b/config/x86_64/default > > index 246b1f3df46..6c3d20af7b4 100644 > > --- a/config/x86_64/default > > +++ b/config/x86_64/default > > @@ -502,6 +502,7 @@ CONFIG_X86_INTEL_TSX_MODE_OFF=y > > # CONFIG_X86_INTEL_TSX_MODE_ON is not set > > # CONFIG_X86_INTEL_TSX_MODE_AUTO is not set > > CONFIG_X86_SGX=y > > +CONFIG_CGROUP_SGX_EPC=y > > # CONFIG_X86_USER_SHADOW_STACK is not set > > CONFIG_EFI=y > > CONFIG_EFI_STUB=y > > > > It is a small change but affects all of downstream, not just OpenSUSE. > > > > I hard time projecting a situation where you wanted SGX but without > > cgroups support so perhaps it would be a better idea to enable cgroups > > unconditionally when SGX and cgroups are part of the kernel config? > > > > Then downstream can just pick the patches and call it a day... > > > > BR, Jarkko > > > > I don't have issue to remove this config and conditionally compile in SGX > cgroup implementation when MISC is configured. I did run the basic test by manually creating the cgroup so you could add tested-by from my side to the other kernel patches expect this one :-) I've reviewed it enough rounds and given various code suggestions etc. For me it is "good enough" or has been for a while. I just want this test to work so that people doing kernel QA will automatically get it to their testing cycle. That is why proper integration to kselftest framework is a must. BR, Jarkko
On Mon Apr 15, 2024 at 8:32 PM EEST, Haitao Huang wrote: > On Sat, 13 Apr 2024 16:34:17 -0500, Jarkko Sakkinen <jarkko@kernel.org> > wrote: > > > On Wed Apr 10, 2024 at 9:25 PM EEST, Haitao Huang wrote: > >> To run selftests for EPC cgroup: > >> > >> sudo ./run_epc_cg_selftests.sh > >> > >> To watch misc cgroup 'current' changes during testing, run this in a > >> separate terminal: > >> > >> ./watch_misc_for_tests.sh current > >> > >> With different cgroups, the script starts one or multiple concurrent SGX > >> selftests (test_sgx), each to run the unclobbered_vdso_oversubscribed > >> test case, which loads an enclave of EPC size equal to the EPC capacity > >> available on the platform. The script checks results against the > >> expectation set for each cgroup and reports success or failure. > >> > >> The script creates 3 different cgroups at the beginning with following > >> expectations: > >> > >> 1) SMALL - intentionally small enough to fail the test loading an > >> enclave of size equal to the capacity. > >> 2) LARGE - large enough to run up to 4 concurrent tests but fail some if > >> more than 4 concurrent tests are run. The script starts 4 expecting at > >> least one test to pass, and then starts 5 expecting at least one test > >> to fail. > >> 3) LARGER - limit is the same as the capacity, large enough to run lots > >> of > >> concurrent tests. The script starts 8 of them and expects all pass. > >> Then it reruns the same test with one process randomly killed and > >> usage checked to be zero after all processes exit. > >> > >> The script also includes a test with low mem_cg limit and LARGE sgx_epc > >> limit to verify that the RAM used for per-cgroup reclamation is charged > >> to a proper mem_cg. For this test, it turns off swapping before start, > >> and turns swapping back on afterwards. > >> > >> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> > >> --- > >> V11: > >> - Remove cgroups-tools dependency and make scripts ash compatible. > >> (Jarkko) > >> - Drop support for cgroup v1 and simplify. (Michal, Jarkko) > >> - Add documentation for functions. (Jarkko) > >> - Turn off swapping before memcontrol tests and back on after > >> - Format and style fixes, name for hard coded values > >> > >> V7: > >> - Added memcontrol test. > >> > >> V5: > >> - Added script with automatic results checking, remove the interactive > >> script. > >> - The script can run independent from the series below. > >> --- > >> tools/testing/selftests/sgx/ash_cgexec.sh | 16 + > >> .../selftests/sgx/run_epc_cg_selftests.sh | 275 ++++++++++++++++++ > >> .../selftests/sgx/watch_misc_for_tests.sh | 11 + > >> 3 files changed, 302 insertions(+) > >> create mode 100755 tools/testing/selftests/sgx/ash_cgexec.sh > >> create mode 100755 tools/testing/selftests/sgx/run_epc_cg_selftests.sh > >> create mode 100755 tools/testing/selftests/sgx/watch_misc_for_tests.sh > >> > >> diff --git a/tools/testing/selftests/sgx/ash_cgexec.sh > >> b/tools/testing/selftests/sgx/ash_cgexec.sh > >> new file mode 100755 > >> index 000000000000..cfa5d2b0e795 > >> --- /dev/null > >> +++ b/tools/testing/selftests/sgx/ash_cgexec.sh > >> @@ -0,0 +1,16 @@ > >> +#!/usr/bin/env sh > >> +# SPDX-License-Identifier: GPL-2.0 > >> +# Copyright(c) 2024 Intel Corporation. > >> + > >> +# Start a program in a given cgroup. > >> +# Supports V2 cgroup paths, relative to /sys/fs/cgroup > >> +if [ "$#" -lt 2 ]; then > >> + echo "Usage: $0 <v2 cgroup path> <command> [args...]" > >> + exit 1 > >> +fi > >> +# Move this shell to the cgroup. > >> +echo 0 >/sys/fs/cgroup/$1/cgroup.procs > >> +shift > >> +# Execute the command within the cgroup > >> +exec "$@" > >> + > >> diff --git a/tools/testing/selftests/sgx/run_epc_cg_selftests.sh > >> b/tools/testing/selftests/sgx/run_epc_cg_selftests.sh > >> new file mode 100755 > >> index 000000000000..dd56273056fc > >> --- /dev/null > >> +++ b/tools/testing/selftests/sgx/run_epc_cg_selftests.sh > >> @@ -0,0 +1,275 @@ > >> +#!/usr/bin/env sh > >> +# SPDX-License-Identifier: GPL-2.0 > >> +# Copyright(c) 2023, 2024 Intel Corporation. > >> + > >> +TEST_ROOT_CG=selftest > >> +TEST_CG_SUB1=$TEST_ROOT_CG/test1 > >> +TEST_CG_SUB2=$TEST_ROOT_CG/test2 > >> +# We will only set limit in test1 and run tests in test3 > >> +TEST_CG_SUB3=$TEST_ROOT_CG/test1/test3 > >> +TEST_CG_SUB4=$TEST_ROOT_CG/test4 > >> + > >> +# Cgroup v2 only > >> +CG_ROOT=/sys/fs/cgroup > >> +mkdir -p $CG_ROOT/$TEST_CG_SUB1 > >> +mkdir -p $CG_ROOT/$TEST_CG_SUB2 > >> +mkdir -p $CG_ROOT/$TEST_CG_SUB3 > >> +mkdir -p $CG_ROOT/$TEST_CG_SUB4 > >> + > >> +# Turn on misc and memory controller in non-leaf nodes > >> +echo "+misc" > $CG_ROOT/cgroup.subtree_control && \ > >> +echo "+memory" > $CG_ROOT/cgroup.subtree_control && \ > >> +echo "+misc" > $CG_ROOT/$TEST_ROOT_CG/cgroup.subtree_control && \ > >> +echo "+memory" > $CG_ROOT/$TEST_ROOT_CG/cgroup.subtree_control && \ > >> +echo "+misc" > $CG_ROOT/$TEST_CG_SUB1/cgroup.subtree_control > >> +if [ $? -ne 0 ]; then > >> + echo "# Failed setting up cgroups, make sure misc and memory > >> cgroups are enabled." > >> + exit 1 > >> +fi > >> + > >> +CAPACITY=$(grep "sgx_epc" "$CG_ROOT/misc.capacity" | awk '{print $2}') > >> +# This is below number of VA pages needed for enclave of capacity > >> size. So > >> +# should fail oversubscribed cases > >> +SMALL=$(( CAPACITY / 512 )) > >> + > >> +# At least load one enclave of capacity size successfully, maybe up to > >> 4. > >> +# But some may fail if we run more than 4 concurrent enclaves of > >> capacity size. > >> +LARGE=$(( SMALL * 4 )) > >> + > >> +# Load lots of enclaves > >> +LARGER=$CAPACITY > >> +echo "# Setting up limits." > >> +echo "sgx_epc $SMALL" > $CG_ROOT/$TEST_CG_SUB1/misc.max && \ > >> +echo "sgx_epc $LARGE" > $CG_ROOT/$TEST_CG_SUB2/misc.max && \ > >> +echo "sgx_epc $LARGER" > $CG_ROOT/$TEST_CG_SUB4/misc.max > >> +if [ $? -ne 0 ]; then > >> + echo "# Failed setting up misc limits." > >> + exit 1 > >> +fi > >> + > >> +clean_up() > >> +{ > >> + sleep 2 > >> + rmdir $CG_ROOT/$TEST_CG_SUB2 > >> + rmdir $CG_ROOT/$TEST_CG_SUB3 > >> + rmdir $CG_ROOT/$TEST_CG_SUB4 > >> + rmdir $CG_ROOT/$TEST_CG_SUB1 > >> + rmdir $CG_ROOT/$TEST_ROOT_CG > >> +} > >> + > >> +timestamp=$(date +%Y%m%d_%H%M%S) > >> + > >> +test_cmd="./test_sgx -t unclobbered_vdso_oversubscribed" > >> + > >> +PROCESS_SUCCESS=1 > >> +PROCESS_FAILURE=0 > >> + > >> +# Wait for a process and check for expected exit status. > >> +# > >> +# Arguments: > >> +# $1 - the pid of the process to wait and check. > >> +# $2 - 1 if expecting success, 0 for failure. > >> +# > >> +# Return: > >> +# 0 if the exit status of the process matches the expectation. > >> +# 1 otherwise. > >> +wait_check_process_status() { > >> + pid=$1 > >> + check_for_success=$2 > >> + > >> + wait "$pid" > >> + status=$? > >> + > >> + if [ $check_for_success -eq $PROCESS_SUCCESS ] && [ $status -eq 0 > >> ]; then > >> + echo "# Process $pid succeeded." > >> + return 0 > >> + elif [ $check_for_success -eq $PROCESS_FAILURE ] && [ $status -ne > >> 0 ]; then > >> + echo "# Process $pid returned failure." > >> + return 0 > >> + fi > >> + return 1 > >> +} > >> + > >> +# Wait for a set of processes and check for expected exit status > >> +# > >> +# Arguments: > >> +# $1 - 1 if expecting success, 0 for failure. > >> +# remaining args - The pids of the processes > >> +# > >> +# Return: > >> +# 0 if exit status of any process matches the expectation. > >> +# 1 otherwise. > >> +wait_and_detect_for_any() { > >> + check_for_success=$1 > >> + > >> + shift > >> + detected=1 # 0 for success detection > >> + > >> + for pid in $@; do > >> + if wait_check_process_status "$pid" "$check_for_success"; then > >> + detected=0 > >> + # Wait for other processes to exit > >> + fi > >> + done > >> + > >> + return $detected > >> +} > >> + > >> +echo "# Start unclobbered_vdso_oversubscribed with SMALL limit, > >> expecting failure..." > >> +# Always use leaf node of misc cgroups > >> +# these may fail on OOM > >> +./ash_cgexec.sh $TEST_CG_SUB3 $test_cmd >cgtest_small_$timestamp.log > >> 2>&1 > >> +if [ $? -eq 0 ]; then > >> + echo "# Fail on SMALL limit, not expecting any test passes." > >> + clean_up > >> + exit 1 > >> +else > >> + echo "# Test failed as expected." > >> +fi > >> + > >> +echo "# PASSED SMALL limit." > >> + > >> +echo "# Start 4 concurrent unclobbered_vdso_oversubscribed tests with > >> LARGE limit, > >> + expecting at least one success...." > >> + > >> +pids="" > >> +for i in 1 2 3 4; do > >> + ( > >> + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd > >> >cgtest_large_positive_$timestamp.$i.log 2>&1 > >> + ) & > >> + pids="$pids $!" > >> +done > >> + > >> + > >> +if wait_and_detect_for_any $PROCESS_SUCCESS "$pids"; then > >> + echo "# PASSED LARGE limit positive testing." > >> +else > >> + echo "# Failed on LARGE limit positive testing, no test passes." > >> + clean_up > >> + exit 1 > >> +fi > >> + > >> +echo "# Start 5 concurrent unclobbered_vdso_oversubscribed tests with > >> LARGE limit, > >> + expecting at least one failure...." > >> +pids="" > >> +for i in 1 2 3 4 5; do > >> + ( > >> + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd > >> >cgtest_large_negative_$timestamp.$i.log 2>&1 > >> + ) & > >> + pids="$pids $!" > >> +done > >> + > >> +if wait_and_detect_for_any $PROCESS_FAILURE "$pids"; then > >> + echo "# PASSED LARGE limit negative testing." > >> +else > >> + echo "# Failed on LARGE limit negative testing, no test fails." > >> + clean_up > >> + exit 1 > >> +fi > >> + > >> +echo "# Start 8 concurrent unclobbered_vdso_oversubscribed tests with > >> LARGER limit, > >> + expecting no failure...." > >> +pids="" > >> +for i in 1 2 3 4 5 6 7 8; do > >> + ( > >> + ./ash_cgexec.sh $TEST_CG_SUB4 $test_cmd > >> >cgtest_larger_$timestamp.$i.log 2>&1 > >> + ) & > >> + pids="$pids $!" > >> +done > >> + > >> +if wait_and_detect_for_any $PROCESS_FAILURE "$pids"; then > >> + echo "# Failed on LARGER limit, at least one test fails." > >> + clean_up > >> + exit 1 > >> +else > >> + echo "# PASSED LARGER limit tests." > >> +fi > >> + > >> +echo "# Start 8 concurrent unclobbered_vdso_oversubscribed tests with > >> LARGER limit, > >> + randomly kill one, expecting no failure...." > >> +pids="" > >> +for i in 1 2 3 4 5 6 7 8; do > >> + ( > >> + ./ash_cgexec.sh $TEST_CG_SUB4 $test_cmd > >> >cgtest_larger_kill_$timestamp.$i.log 2>&1 > >> + ) & > >> + pids="$pids $!" > >> +done > >> +random_number=$(awk 'BEGIN{srand();print int(rand()*5)}') > >> +sleep $((random_number + 1)) > >> + > >> +# Randomly select a process to kill > >> +# Make sure usage counter not leaked at the end. > >> +RANDOM_INDEX=$(awk 'BEGIN{srand();print int(rand()*8)}') > >> +counter=0 > >> +for pid in $pids; do > >> + if [ "$counter" -eq "$RANDOM_INDEX" ]; then > >> + PID_TO_KILL=$pid > >> + break > >> + fi > >> + counter=$((counter + 1)) > >> +done > >> + > >> +kill $PID_TO_KILL > >> +echo "# Killed process with PID: $PID_TO_KILL" > >> + > >> +any_failure=0 > >> +for pid in $pids; do > >> + wait "$pid" > >> + status=$? > >> + if [ "$pid" != "$PID_TO_KILL" ]; then > >> + if [ $status -ne 0 ]; then > >> + echo "# Process $pid returned failure." > >> + any_failure=1 > >> + fi > >> + fi > >> +done > >> + > >> +if [ $any_failure -ne 0 ]; then > >> + echo "# Failed on random killing, at least one test fails." > >> + clean_up > >> + exit 1 > >> +fi > >> +echo "# PASSED LARGER limit test with a process randomly killed." > >> + > >> +MEM_LIMIT_TOO_SMALL=$((CAPACITY - 2 * LARGE)) > >> + > >> +echo "$MEM_LIMIT_TOO_SMALL" > $CG_ROOT/$TEST_CG_SUB2/memory.max > >> +if [ $? -ne 0 ]; then > >> + echo "# Failed creating memory controller." > >> + clean_up > >> + exit 1 > >> +fi > >> + > >> +echo "# Start 4 concurrent unclobbered_vdso_oversubscribed tests with > >> LARGE EPC limit, > >> + and too small RAM limit, expecting all failures...." > >> +# Ensure swapping off so the OOM killer is activated when mem_cgroup > >> limit is hit. > >> +swapoff -a > >> +pids="" > >> +for i in 1 2 3 4; do > >> + ( > >> + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd > >> >cgtest_large_oom_$timestamp.$i.log 2>&1 > >> + ) & > >> + pids="$pids $!" > >> +done > >> + > >> +if wait_and_detect_for_any $PROCESS_SUCCESS "$pids"; then > >> + echo "# Failed on tests with memcontrol, some tests did not fail." > >> + clean_up > >> + swapon -a > >> + exit 1 > >> +else > >> + swapon -a > >> + echo "# PASSED LARGE limit tests with memcontrol." > >> +fi > >> + > >> +sleep 2 > >> + > >> +USAGE=$(grep '^sgx_epc' "$CG_ROOT/$TEST_ROOT_CG/misc.current" | awk > >> '{print $2}') > >> +if [ "$USAGE" -ne 0 ]; then > >> + echo "# Failed: Final usage is $USAGE, not 0." > >> +else > >> + echo "# PASSED leakage check." > >> + echo "# PASSED ALL cgroup limit tests, cleanup cgroups..." > >> +fi > >> +clean_up > >> +echo "# done." > >> diff --git a/tools/testing/selftests/sgx/watch_misc_for_tests.sh > >> b/tools/testing/selftests/sgx/watch_misc_for_tests.sh > >> new file mode 100755 > >> index 000000000000..1c9985726ace > >> --- /dev/null > >> +++ b/tools/testing/selftests/sgx/watch_misc_for_tests.sh > >> @@ -0,0 +1,11 @@ > >> +#!/usr/bin/env sh > >> +# SPDX-License-Identifier: GPL-2.0 > >> +# Copyright(c) 2023, 2024 Intel Corporation. > >> + > >> +if [ -z "$1" ]; then > >> + echo "No argument supplied, please provide 'max', 'current', or > >> 'events'" > >> + exit 1 > >> +fi > >> + > >> +watch -n 1 'find /sys/fs/cgroup -wholename "*/test*/misc.'$1'" -exec \ > >> + sh -c '\''echo "$1:"; cat "$1"'\'' _ {} \;' > > > > I'll compile the kernel with this and see what happens! > > > > Have you tried to run the test suite from top-level? This is just a > > sanity check. I've few times forgot to do this so thus asking :-) > > > > BR, Jarkko > > > > I added following on > https://github.com/haitaohuang/linux/tree/sgx_cg_upstream_v11_plus > Please update to run from top-level. > > --- a/tools/testing/selftests/sgx/Makefile > +++ b/tools/testing/selftests/sgx/Makefile > @@ -20,7 +20,8 @@ ENCL_LDFLAGS := -Wl,-T,test_encl.lds,--build-id=none > > ifeq ($(CAN_BUILD_X86_64), 1) > TEST_CUSTOM_PROGS := $(OUTPUT)/test_sgx > -TEST_FILES := $(OUTPUT)/test_encl.elf > +TEST_FILES := $(OUTPUT)/test_encl.elf ash_cgexec.sh > +TEST_PROGS := run_epc_cg_selftests.sh > > all: $(TEST_CUSTOM_PROGS) $(OUTPUT)/test_encl.elf > endif > > ... > > index dd56273056fc..ba0451fc16bc 100755 > --- a/tools/testing/selftests/sgx/run_epc_cg_selftests.sh > +++ b/tools/testing/selftests/sgx/run_epc_cg_selftests.sh > @@ -2,6 +2,14 @@ > # SPDX-License-Identifier: GPL-2.0 > # Copyright(c) 2023, 2024 Intel Corporation. > > + > +# Kselftest framework requirement - SKIP code is 4. > +ksft_skip=4 > +if [ "$(id -u)" -ne 0 ]; then > + echo "SKIP: SGX Cgroup tests need root priviledges." > + exit $ksft_skip > +fi > + > TEST_ROOT_CG=selftest > > Thanks > Haitao OK, I'll move this mail to my TODO folder and try to get it tested asap. BR, Jarkko
On Mon, 15 Apr 2024 14:08:44 -0500, Jarkko Sakkinen <jarkko@kernel.org> wrote: > On Mon Apr 15, 2024 at 6:13 AM EEST, Haitao Huang wrote: >> On Sun, 14 Apr 2024 10:01:03 -0500, Jarkko Sakkinen <jarkko@kernel.org> >> wrote: >> >> > On Wed Apr 10, 2024 at 9:25 PM EEST, Haitao Huang wrote: >> >> To run selftests for EPC cgroup: >> >> >> >> sudo ./run_epc_cg_selftests.sh >> >> >> >> To watch misc cgroup 'current' changes during testing, run this in a >> >> separate terminal: >> >> >> >> ./watch_misc_for_tests.sh current >> >> >> >> With different cgroups, the script starts one or multiple concurrent >> SGX >> >> selftests (test_sgx), each to run the unclobbered_vdso_oversubscribed >> >> test case, which loads an enclave of EPC size equal to the EPC >> capacity >> >> available on the platform. The script checks results against the >> >> expectation set for each cgroup and reports success or failure. >> >> >> >> The script creates 3 different cgroups at the beginning with >> following >> >> expectations: >> >> >> >> 1) SMALL - intentionally small enough to fail the test loading an >> >> enclave of size equal to the capacity. >> >> 2) LARGE - large enough to run up to 4 concurrent tests but fail >> some if >> >> more than 4 concurrent tests are run. The script starts 4 expecting >> at >> >> least one test to pass, and then starts 5 expecting at least one test >> >> to fail. >> >> 3) LARGER - limit is the same as the capacity, large enough to run >> lots >> >> of >> >> concurrent tests. The script starts 8 of them and expects all pass. >> >> Then it reruns the same test with one process randomly killed and >> >> usage checked to be zero after all processes exit. >> >> >> >> The script also includes a test with low mem_cg limit and LARGE >> sgx_epc >> >> limit to verify that the RAM used for per-cgroup reclamation is >> charged >> >> to a proper mem_cg. For this test, it turns off swapping before >> start, >> >> and turns swapping back on afterwards. >> >> >> >> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> >> >> --- >> >> V11: >> >> - Remove cgroups-tools dependency and make scripts ash compatible. >> >> (Jarkko) >> >> - Drop support for cgroup v1 and simplify. (Michal, Jarkko) >> >> - Add documentation for functions. (Jarkko) >> >> - Turn off swapping before memcontrol tests and back on after >> >> - Format and style fixes, name for hard coded values >> >> >> >> V7: >> >> - Added memcontrol test. >> >> >> >> V5: >> >> - Added script with automatic results checking, remove the >> interactive >> >> script. >> >> - The script can run independent from the series below. >> >> --- >> >> tools/testing/selftests/sgx/ash_cgexec.sh | 16 + >> >> .../selftests/sgx/run_epc_cg_selftests.sh | 275 >> ++++++++++++++++++ >> >> .../selftests/sgx/watch_misc_for_tests.sh | 11 + >> >> 3 files changed, 302 insertions(+) >> >> create mode 100755 tools/testing/selftests/sgx/ash_cgexec.sh >> >> create mode 100755 >> tools/testing/selftests/sgx/run_epc_cg_selftests.sh >> >> create mode 100755 >> tools/testing/selftests/sgx/watch_misc_for_tests.sh >> >> >> >> diff --git a/tools/testing/selftests/sgx/ash_cgexec.sh >> >> b/tools/testing/selftests/sgx/ash_cgexec.sh >> >> new file mode 100755 >> >> index 000000000000..cfa5d2b0e795 >> >> --- /dev/null >> >> +++ b/tools/testing/selftests/sgx/ash_cgexec.sh >> >> @@ -0,0 +1,16 @@ >> >> +#!/usr/bin/env sh >> >> +# SPDX-License-Identifier: GPL-2.0 >> >> +# Copyright(c) 2024 Intel Corporation. >> >> + >> >> +# Start a program in a given cgroup. >> >> +# Supports V2 cgroup paths, relative to /sys/fs/cgroup >> >> +if [ "$#" -lt 2 ]; then >> >> + echo "Usage: $0 <v2 cgroup path> <command> [args...]" >> >> + exit 1 >> >> +fi >> >> +# Move this shell to the cgroup. >> >> +echo 0 >/sys/fs/cgroup/$1/cgroup.procs >> >> +shift >> >> +# Execute the command within the cgroup >> >> +exec "$@" >> >> + >> >> diff --git a/tools/testing/selftests/sgx/run_epc_cg_selftests.sh >> >> b/tools/testing/selftests/sgx/run_epc_cg_selftests.sh >> >> new file mode 100755 >> >> index 000000000000..dd56273056fc >> >> --- /dev/null >> >> +++ b/tools/testing/selftests/sgx/run_epc_cg_selftests.sh >> >> @@ -0,0 +1,275 @@ >> >> +#!/usr/bin/env sh >> >> +# SPDX-License-Identifier: GPL-2.0 >> >> +# Copyright(c) 2023, 2024 Intel Corporation. >> >> + >> >> +TEST_ROOT_CG=selftest >> >> +TEST_CG_SUB1=$TEST_ROOT_CG/test1 >> >> +TEST_CG_SUB2=$TEST_ROOT_CG/test2 >> >> +# We will only set limit in test1 and run tests in test3 >> >> +TEST_CG_SUB3=$TEST_ROOT_CG/test1/test3 >> >> +TEST_CG_SUB4=$TEST_ROOT_CG/test4 >> >> + >> >> +# Cgroup v2 only >> >> +CG_ROOT=/sys/fs/cgroup >> >> +mkdir -p $CG_ROOT/$TEST_CG_SUB1 >> >> +mkdir -p $CG_ROOT/$TEST_CG_SUB2 >> >> +mkdir -p $CG_ROOT/$TEST_CG_SUB3 >> >> +mkdir -p $CG_ROOT/$TEST_CG_SUB4 >> >> + >> >> +# Turn on misc and memory controller in non-leaf nodes >> >> +echo "+misc" > $CG_ROOT/cgroup.subtree_control && \ >> >> +echo "+memory" > $CG_ROOT/cgroup.subtree_control && \ >> >> +echo "+misc" > $CG_ROOT/$TEST_ROOT_CG/cgroup.subtree_control && \ >> >> +echo "+memory" > $CG_ROOT/$TEST_ROOT_CG/cgroup.subtree_control && \ >> >> +echo "+misc" > $CG_ROOT/$TEST_CG_SUB1/cgroup.subtree_control >> >> +if [ $? -ne 0 ]; then >> >> + echo "# Failed setting up cgroups, make sure misc and memory >> >> cgroups are enabled." >> >> + exit 1 >> >> +fi >> >> + >> >> +CAPACITY=$(grep "sgx_epc" "$CG_ROOT/misc.capacity" | awk '{print >> $2}') >> >> +# This is below number of VA pages needed for enclave of capacity >> >> size. So >> >> +# should fail oversubscribed cases >> >> +SMALL=$(( CAPACITY / 512 )) >> >> + >> >> +# At least load one enclave of capacity size successfully, maybe up >> to >> >> 4. >> >> +# But some may fail if we run more than 4 concurrent enclaves of >> >> capacity size. >> >> +LARGE=$(( SMALL * 4 )) >> >> + >> >> +# Load lots of enclaves >> >> +LARGER=$CAPACITY >> >> +echo "# Setting up limits." >> >> +echo "sgx_epc $SMALL" > $CG_ROOT/$TEST_CG_SUB1/misc.max && \ >> >> +echo "sgx_epc $LARGE" > $CG_ROOT/$TEST_CG_SUB2/misc.max && \ >> >> +echo "sgx_epc $LARGER" > $CG_ROOT/$TEST_CG_SUB4/misc.max >> >> +if [ $? -ne 0 ]; then >> >> + echo "# Failed setting up misc limits." >> >> + exit 1 >> >> +fi >> >> + >> >> +clean_up() >> >> +{ >> >> + sleep 2 >> >> + rmdir $CG_ROOT/$TEST_CG_SUB2 >> >> + rmdir $CG_ROOT/$TEST_CG_SUB3 >> >> + rmdir $CG_ROOT/$TEST_CG_SUB4 >> >> + rmdir $CG_ROOT/$TEST_CG_SUB1 >> >> + rmdir $CG_ROOT/$TEST_ROOT_CG >> >> +} >> >> + >> >> +timestamp=$(date +%Y%m%d_%H%M%S) >> >> + >> >> +test_cmd="./test_sgx -t unclobbered_vdso_oversubscribed" >> >> + >> >> +PROCESS_SUCCESS=1 >> >> +PROCESS_FAILURE=0 >> >> + >> >> +# Wait for a process and check for expected exit status. >> >> +# >> >> +# Arguments: >> >> +# $1 - the pid of the process to wait and check. >> >> +# $2 - 1 if expecting success, 0 for failure. >> >> +# >> >> +# Return: >> >> +# 0 if the exit status of the process matches the expectation. >> >> +# 1 otherwise. >> >> +wait_check_process_status() { >> >> + pid=$1 >> >> + check_for_success=$2 >> >> + >> >> + wait "$pid" >> >> + status=$? >> >> + >> >> + if [ $check_for_success -eq $PROCESS_SUCCESS ] && [ $status -eq >> 0 >> >> ]; then >> >> + echo "# Process $pid succeeded." >> >> + return 0 >> >> + elif [ $check_for_success -eq $PROCESS_FAILURE ] && [ $status >> -ne >> >> 0 ]; then >> >> + echo "# Process $pid returned failure." >> >> + return 0 >> >> + fi >> >> + return 1 >> >> +} >> >> + >> >> +# Wait for a set of processes and check for expected exit status >> >> +# >> >> +# Arguments: >> >> +# $1 - 1 if expecting success, 0 for failure. >> >> +# remaining args - The pids of the processes >> >> +# >> >> +# Return: >> >> +# 0 if exit status of any process matches the expectation. >> >> +# 1 otherwise. >> >> +wait_and_detect_for_any() { >> >> + check_for_success=$1 >> >> + >> >> + shift >> >> + detected=1 # 0 for success detection >> >> + >> >> + for pid in $@; do >> >> + if wait_check_process_status "$pid" "$check_for_success"; >> then >> >> + detected=0 >> >> + # Wait for other processes to exit >> >> + fi >> >> + done >> >> + >> >> + return $detected >> >> +} >> >> + >> >> +echo "# Start unclobbered_vdso_oversubscribed with SMALL limit, >> >> expecting failure..." >> >> +# Always use leaf node of misc cgroups >> >> +# these may fail on OOM >> >> +./ash_cgexec.sh $TEST_CG_SUB3 $test_cmd >cgtest_small_$timestamp.log >> >> 2>&1 >> >> +if [ $? -eq 0 ]; then >> >> + echo "# Fail on SMALL limit, not expecting any test passes." >> >> + clean_up >> >> + exit 1 >> >> +else >> >> + echo "# Test failed as expected." >> >> +fi >> >> + >> >> +echo "# PASSED SMALL limit." >> >> + >> >> +echo "# Start 4 concurrent unclobbered_vdso_oversubscribed tests >> with >> >> LARGE limit, >> >> + expecting at least one success...." >> >> + >> >> +pids="" >> >> +for i in 1 2 3 4; do >> >> + ( >> >> + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd >> >> >cgtest_large_positive_$timestamp.$i.log 2>&1 >> >> + ) & >> >> + pids="$pids $!" >> >> +done >> >> + >> >> + >> >> +if wait_and_detect_for_any $PROCESS_SUCCESS "$pids"; then >> >> + echo "# PASSED LARGE limit positive testing." >> >> +else >> >> + echo "# Failed on LARGE limit positive testing, no test passes." >> >> + clean_up >> >> + exit 1 >> >> +fi >> >> + >> >> +echo "# Start 5 concurrent unclobbered_vdso_oversubscribed tests >> with >> >> LARGE limit, >> >> + expecting at least one failure...." >> >> +pids="" >> >> +for i in 1 2 3 4 5; do >> >> + ( >> >> + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd >> >> >cgtest_large_negative_$timestamp.$i.log 2>&1 >> >> + ) & >> >> + pids="$pids $!" >> >> +done >> >> + >> >> +if wait_and_detect_for_any $PROCESS_FAILURE "$pids"; then >> >> + echo "# PASSED LARGE limit negative testing." >> >> +else >> >> + echo "# Failed on LARGE limit negative testing, no test fails." >> >> + clean_up >> >> + exit 1 >> >> +fi >> >> + >> >> +echo "# Start 8 concurrent unclobbered_vdso_oversubscribed tests >> with >> >> LARGER limit, >> >> + expecting no failure...." >> >> +pids="" >> >> +for i in 1 2 3 4 5 6 7 8; do >> >> + ( >> >> + ./ash_cgexec.sh $TEST_CG_SUB4 $test_cmd >> >> >cgtest_larger_$timestamp.$i.log 2>&1 >> >> + ) & >> >> + pids="$pids $!" >> >> +done >> >> + >> >> +if wait_and_detect_for_any $PROCESS_FAILURE "$pids"; then >> >> + echo "# Failed on LARGER limit, at least one test fails." >> >> + clean_up >> >> + exit 1 >> >> +else >> >> + echo "# PASSED LARGER limit tests." >> >> +fi >> >> + >> >> +echo "# Start 8 concurrent unclobbered_vdso_oversubscribed tests >> with >> >> LARGER limit, >> >> + randomly kill one, expecting no failure...." >> >> +pids="" >> >> +for i in 1 2 3 4 5 6 7 8; do >> >> + ( >> >> + ./ash_cgexec.sh $TEST_CG_SUB4 $test_cmd >> >> >cgtest_larger_kill_$timestamp.$i.log 2>&1 >> >> + ) & >> >> + pids="$pids $!" >> >> +done >> >> +random_number=$(awk 'BEGIN{srand();print int(rand()*5)}') >> >> +sleep $((random_number + 1)) >> >> + >> >> +# Randomly select a process to kill >> >> +# Make sure usage counter not leaked at the end. >> >> +RANDOM_INDEX=$(awk 'BEGIN{srand();print int(rand()*8)}') >> >> +counter=0 >> >> +for pid in $pids; do >> >> + if [ "$counter" -eq "$RANDOM_INDEX" ]; then >> >> + PID_TO_KILL=$pid >> >> + break >> >> + fi >> >> + counter=$((counter + 1)) >> >> +done >> >> + >> >> +kill $PID_TO_KILL >> >> +echo "# Killed process with PID: $PID_TO_KILL" >> >> + >> >> +any_failure=0 >> >> +for pid in $pids; do >> >> + wait "$pid" >> >> + status=$? >> >> + if [ "$pid" != "$PID_TO_KILL" ]; then >> >> + if [ $status -ne 0 ]; then >> >> + echo "# Process $pid returned failure." >> >> + any_failure=1 >> >> + fi >> >> + fi >> >> +done >> >> + >> >> +if [ $any_failure -ne 0 ]; then >> >> + echo "# Failed on random killing, at least one test fails." >> >> + clean_up >> >> + exit 1 >> >> +fi >> >> +echo "# PASSED LARGER limit test with a process randomly killed." >> >> + >> >> +MEM_LIMIT_TOO_SMALL=$((CAPACITY - 2 * LARGE)) >> >> + >> >> +echo "$MEM_LIMIT_TOO_SMALL" > $CG_ROOT/$TEST_CG_SUB2/memory.max >> >> +if [ $? -ne 0 ]; then >> >> + echo "# Failed creating memory controller." >> >> + clean_up >> >> + exit 1 >> >> +fi >> >> + >> >> +echo "# Start 4 concurrent unclobbered_vdso_oversubscribed tests >> with >> >> LARGE EPC limit, >> >> + and too small RAM limit, expecting all failures...." >> >> +# Ensure swapping off so the OOM killer is activated when mem_cgroup >> >> limit is hit. >> >> +swapoff -a >> >> +pids="" >> >> +for i in 1 2 3 4; do >> >> + ( >> >> + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd >> >> >cgtest_large_oom_$timestamp.$i.log 2>&1 >> >> + ) & >> >> + pids="$pids $!" >> >> +done >> >> + >> >> +if wait_and_detect_for_any $PROCESS_SUCCESS "$pids"; then >> >> + echo "# Failed on tests with memcontrol, some tests did not >> fail." >> >> + clean_up >> >> + swapon -a >> >> + exit 1 >> >> +else >> >> + swapon -a >> >> + echo "# PASSED LARGE limit tests with memcontrol." >> >> +fi >> >> + >> >> +sleep 2 >> >> + >> >> +USAGE=$(grep '^sgx_epc' "$CG_ROOT/$TEST_ROOT_CG/misc.current" | awk >> >> '{print $2}') >> >> +if [ "$USAGE" -ne 0 ]; then >> >> + echo "# Failed: Final usage is $USAGE, not 0." >> >> +else >> >> + echo "# PASSED leakage check." >> >> + echo "# PASSED ALL cgroup limit tests, cleanup cgroups..." >> >> +fi >> >> +clean_up >> >> +echo "# done." >> >> diff --git a/tools/testing/selftests/sgx/watch_misc_for_tests.sh >> >> b/tools/testing/selftests/sgx/watch_misc_for_tests.sh >> >> new file mode 100755 >> >> index 000000000000..1c9985726ace >> >> --- /dev/null >> >> +++ b/tools/testing/selftests/sgx/watch_misc_for_tests.sh >> >> @@ -0,0 +1,11 @@ >> >> +#!/usr/bin/env sh >> >> +# SPDX-License-Identifier: GPL-2.0 >> >> +# Copyright(c) 2023, 2024 Intel Corporation. >> >> + >> >> +if [ -z "$1" ]; then >> >> + echo "No argument supplied, please provide 'max', 'current', or >> >> 'events'" >> >> + exit 1 >> >> +fi >> >> + >> >> +watch -n 1 'find /sys/fs/cgroup -wholename "*/test*/misc.'$1'" >> -exec \ >> >> + sh -c '\''echo "$1:"; cat "$1"'\'' _ {} \;' >> > >> > So this is what happens now: >> > >> > jarkko@mustatorvisieni:~/linux-tpmdd> make -C >> > tools/testing/selftests/sgx run_tests >> > make: Entering directory >> > '/home/jarkko/linux-tpmdd/tools/testing/selftests/sgx' >> > TAP version 13 >> > 1..1 >> > # timeout set to 45 >> > # selftests: sgx: test_sgx >> > # TAP version 13 >> > # 1..16 >> > # # Starting 16 tests from 1 test cases. >> > # # RUN enclave.unclobbered_vdso ... >> > # # OK enclave.unclobbered_vdso >> > # ok 1 enclave.unclobbered_vdso >> > # # RUN enclave.unclobbered_vdso_oversubscribed ... >> > # # OK enclave.unclobbered_vdso_oversubscribed >> > # ok 2 enclave.unclobbered_vdso_oversubscribed >> > # # RUN enclave.unclobbered_vdso_oversubscribed_remove ... >> > # # main.c:402:unclobbered_vdso_oversubscribed_remove:Creating an >> > enclave with 98566144 bytes heap may take a while ... >> > # # main.c:457:unclobbered_vdso_oversubscribed_remove:Changing type of >> > 98566144 bytes to trimmed may take a while ... >> > # # main.c:473:unclobbered_vdso_oversubscribed_remove:Entering enclave >> > to run EACCEPT for each page of 98566144 bytes may take a while ... >> > # # main.c:494:unclobbered_vdso_oversubscribed_remove:Removing >> 98566144 >> > bytes from enclave may take a while ... >> > # # OK enclave.unclobbered_vdso_oversubscribed_remove >> > # ok 3 enclave.unclobbered_vdso_oversubscribed_remove >> > # # RUN enclave.clobbered_vdso ... >> > # # OK enclave.clobbered_vdso >> > # ok 4 enclave.clobbered_vdso >> > # # RUN enclave.clobbered_vdso_and_user_function ... >> > # # OK enclave.clobbered_vdso_and_user_function >> > # ok 5 enclave.clobbered_vdso_and_user_function >> > # # RUN enclave.tcs_entry ... >> > # # OK enclave.tcs_entry >> > # ok 6 enclave.tcs_entry >> > # # RUN enclave.pte_permissions ... >> > # # OK enclave.pte_permissions >> > # ok 7 enclave.pte_permissions >> > # # RUN enclave.tcs_permissions ... >> > # # OK enclave.tcs_permissions >> > # ok 8 enclave.tcs_permissions >> > # # RUN enclave.epcm_permissions ... >> > # # OK enclave.epcm_permissions >> > # ok 9 enclave.epcm_permissions >> > # # RUN enclave.augment ... >> > # # OK enclave.augment >> > # ok 10 enclave.augment >> > # # RUN enclave.augment_via_eaccept ... >> > # # OK enclave.augment_via_eaccept >> > # ok 11 enclave.augment_via_eaccept >> > # # RUN enclave.tcs_create ... >> > # # OK enclave.tcs_create >> > # ok 12 enclave.tcs_create >> > # # RUN enclave.remove_added_page_no_eaccept ... >> > # # OK enclave.remove_added_page_no_eaccept >> > # ok 13 enclave.remove_added_page_no_eaccept >> > # # RUN enclave.remove_added_page_invalid_access ... >> > # # OK enclave.remove_added_page_invalid_access >> > # ok 14 enclave.remove_added_page_invalid_access >> > # # RUN >> > enclave.remove_added_page_invalid_access_after_eaccept ... >> > # # OK >> enclave.remove_added_page_invalid_access_after_eaccept >> > # ok 15 enclave.remove_added_page_invalid_access_after_eaccept >> > # # RUN enclave.remove_untouched_page ... >> > # # OK enclave.remove_untouched_page >> > # ok 16 enclave.remove_untouched_page >> > # # PASSED: 16 / 16 tests passed. >> > # # Totals: pass:16 fail:0 xfail:0 xpass:0 skip:0 error:0 >> > ok 1 selftests: sgx: test_sgx >> > >> > Also cgroups tests are expected to run when invoking "run_tests". >> > >> I can add the SGX cgroup tests to the "run_tests" target. But it will >> need >> more than the 45secs given by the default timeout especially on >> platforms >> with larger EPC. You will need run with --override-timeout >> (current SGX selftests also timeout for default 45 sec on a server with >> 4G >> EPC). >> >> >> > I also wonder do we really want this: >> > >> > diff --git a/config/x86_64/default b/config/x86_64/default >> > index 246b1f3df46..6c3d20af7b4 100644 >> > --- a/config/x86_64/default >> > +++ b/config/x86_64/default >> > @@ -502,6 +502,7 @@ CONFIG_X86_INTEL_TSX_MODE_OFF=y >> > # CONFIG_X86_INTEL_TSX_MODE_ON is not set >> > # CONFIG_X86_INTEL_TSX_MODE_AUTO is not set >> > CONFIG_X86_SGX=y >> > +CONFIG_CGROUP_SGX_EPC=y >> > # CONFIG_X86_USER_SHADOW_STACK is not set >> > CONFIG_EFI=y >> > CONFIG_EFI_STUB=y >> > >> > It is a small change but affects all of downstream, not just OpenSUSE. >> > >> > I hard time projecting a situation where you wanted SGX but without >> > cgroups support so perhaps it would be a better idea to enable cgroups >> > unconditionally when SGX and cgroups are part of the kernel config? >> > >> > Then downstream can just pick the patches and call it a day... >> > >> > BR, Jarkko >> > >> >> I don't have issue to remove this config and conditionally compile in >> SGX >> cgroup implementation when MISC is configured. > > I did run the basic test by manually creating the cgroup so you could > add tested-by from my side to the other kernel patches expect this one > :-) Thanks > > I've reviewed it enough rounds and given various code suggestions etc. > For me it is "good enough" or has been for a while. I just want this > test to work so that people doing kernel QA will automatically get it > to their testing cycle. That is why proper integration to kselftest > framework is a must. > > BR, Jarkko > Sounds good. I have added integration now on tip of the staging/plus branch: https://github.com/haitaohuang/linux/tree/sgx_cg_upstream_v11_plus Will send v12 with those and with the change to remove CGROUP_SGX_EPC config. Thanks Haitao
Hi Jarkko On Mon, 15 Apr 2024 14:08:44 -0500, Jarkko Sakkinen <jarkko@kernel.org> wrote: > I did run the basic test by manually creating the cgroup so you could > add tested-by from my side to the other kernel patches expect this one > > I've reviewed it enough rounds and given various code suggestions etc. > For me it is "good enough" or has been for a while. I just want this > test to work so that people doing kernel QA will automatically get it > to their testing cycle. That is why proper integration to kselftest > framework is a must May I have your "Reviewed-by" tag also for the patches #8-13? Not sure if I missed any other comments/issue you raised. I think all are addressed in v12. I will refine the test scripts (patch #14) in v13. Thanks Haitao
diff --git a/tools/testing/selftests/sgx/ash_cgexec.sh b/tools/testing/selftests/sgx/ash_cgexec.sh new file mode 100755 index 000000000000..cfa5d2b0e795 --- /dev/null +++ b/tools/testing/selftests/sgx/ash_cgexec.sh @@ -0,0 +1,16 @@ +#!/usr/bin/env sh +# SPDX-License-Identifier: GPL-2.0 +# Copyright(c) 2024 Intel Corporation. + +# Start a program in a given cgroup. +# Supports V2 cgroup paths, relative to /sys/fs/cgroup +if [ "$#" -lt 2 ]; then + echo "Usage: $0 <v2 cgroup path> <command> [args...]" + exit 1 +fi +# Move this shell to the cgroup. +echo 0 >/sys/fs/cgroup/$1/cgroup.procs +shift +# Execute the command within the cgroup +exec "$@" + diff --git a/tools/testing/selftests/sgx/run_epc_cg_selftests.sh b/tools/testing/selftests/sgx/run_epc_cg_selftests.sh new file mode 100755 index 000000000000..dd56273056fc --- /dev/null +++ b/tools/testing/selftests/sgx/run_epc_cg_selftests.sh @@ -0,0 +1,275 @@ +#!/usr/bin/env sh +# SPDX-License-Identifier: GPL-2.0 +# Copyright(c) 2023, 2024 Intel Corporation. + +TEST_ROOT_CG=selftest +TEST_CG_SUB1=$TEST_ROOT_CG/test1 +TEST_CG_SUB2=$TEST_ROOT_CG/test2 +# We will only set limit in test1 and run tests in test3 +TEST_CG_SUB3=$TEST_ROOT_CG/test1/test3 +TEST_CG_SUB4=$TEST_ROOT_CG/test4 + +# Cgroup v2 only +CG_ROOT=/sys/fs/cgroup +mkdir -p $CG_ROOT/$TEST_CG_SUB1 +mkdir -p $CG_ROOT/$TEST_CG_SUB2 +mkdir -p $CG_ROOT/$TEST_CG_SUB3 +mkdir -p $CG_ROOT/$TEST_CG_SUB4 + +# Turn on misc and memory controller in non-leaf nodes +echo "+misc" > $CG_ROOT/cgroup.subtree_control && \ +echo "+memory" > $CG_ROOT/cgroup.subtree_control && \ +echo "+misc" > $CG_ROOT/$TEST_ROOT_CG/cgroup.subtree_control && \ +echo "+memory" > $CG_ROOT/$TEST_ROOT_CG/cgroup.subtree_control && \ +echo "+misc" > $CG_ROOT/$TEST_CG_SUB1/cgroup.subtree_control +if [ $? -ne 0 ]; then + echo "# Failed setting up cgroups, make sure misc and memory cgroups are enabled." + exit 1 +fi + +CAPACITY=$(grep "sgx_epc" "$CG_ROOT/misc.capacity" | awk '{print $2}') +# This is below number of VA pages needed for enclave of capacity size. So +# should fail oversubscribed cases +SMALL=$(( CAPACITY / 512 )) + +# At least load one enclave of capacity size successfully, maybe up to 4. +# But some may fail if we run more than 4 concurrent enclaves of capacity size. +LARGE=$(( SMALL * 4 )) + +# Load lots of enclaves +LARGER=$CAPACITY +echo "# Setting up limits." +echo "sgx_epc $SMALL" > $CG_ROOT/$TEST_CG_SUB1/misc.max && \ +echo "sgx_epc $LARGE" > $CG_ROOT/$TEST_CG_SUB2/misc.max && \ +echo "sgx_epc $LARGER" > $CG_ROOT/$TEST_CG_SUB4/misc.max +if [ $? -ne 0 ]; then + echo "# Failed setting up misc limits." + exit 1 +fi + +clean_up() +{ + sleep 2 + rmdir $CG_ROOT/$TEST_CG_SUB2 + rmdir $CG_ROOT/$TEST_CG_SUB3 + rmdir $CG_ROOT/$TEST_CG_SUB4 + rmdir $CG_ROOT/$TEST_CG_SUB1 + rmdir $CG_ROOT/$TEST_ROOT_CG +} + +timestamp=$(date +%Y%m%d_%H%M%S) + +test_cmd="./test_sgx -t unclobbered_vdso_oversubscribed" + +PROCESS_SUCCESS=1 +PROCESS_FAILURE=0 + +# Wait for a process and check for expected exit status. +# +# Arguments: +# $1 - the pid of the process to wait and check. +# $2 - 1 if expecting success, 0 for failure. +# +# Return: +# 0 if the exit status of the process matches the expectation. +# 1 otherwise. +wait_check_process_status() { + pid=$1 + check_for_success=$2 + + wait "$pid" + status=$? + + if [ $check_for_success -eq $PROCESS_SUCCESS ] && [ $status -eq 0 ]; then + echo "# Process $pid succeeded." + return 0 + elif [ $check_for_success -eq $PROCESS_FAILURE ] && [ $status -ne 0 ]; then + echo "# Process $pid returned failure." + return 0 + fi + return 1 +} + +# Wait for a set of processes and check for expected exit status +# +# Arguments: +# $1 - 1 if expecting success, 0 for failure. +# remaining args - The pids of the processes +# +# Return: +# 0 if exit status of any process matches the expectation. +# 1 otherwise. +wait_and_detect_for_any() { + check_for_success=$1 + + shift + detected=1 # 0 for success detection + + for pid in $@; do + if wait_check_process_status "$pid" "$check_for_success"; then + detected=0 + # Wait for other processes to exit + fi + done + + return $detected +} + +echo "# Start unclobbered_vdso_oversubscribed with SMALL limit, expecting failure..." +# Always use leaf node of misc cgroups +# these may fail on OOM +./ash_cgexec.sh $TEST_CG_SUB3 $test_cmd >cgtest_small_$timestamp.log 2>&1 +if [ $? -eq 0 ]; then + echo "# Fail on SMALL limit, not expecting any test passes." + clean_up + exit 1 +else + echo "# Test failed as expected." +fi + +echo "# PASSED SMALL limit." + +echo "# Start 4 concurrent unclobbered_vdso_oversubscribed tests with LARGE limit, + expecting at least one success...." + +pids="" +for i in 1 2 3 4; do + ( + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd >cgtest_large_positive_$timestamp.$i.log 2>&1 + ) & + pids="$pids $!" +done + + +if wait_and_detect_for_any $PROCESS_SUCCESS "$pids"; then + echo "# PASSED LARGE limit positive testing." +else + echo "# Failed on LARGE limit positive testing, no test passes." + clean_up + exit 1 +fi + +echo "# Start 5 concurrent unclobbered_vdso_oversubscribed tests with LARGE limit, + expecting at least one failure...." +pids="" +for i in 1 2 3 4 5; do + ( + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd >cgtest_large_negative_$timestamp.$i.log 2>&1 + ) & + pids="$pids $!" +done + +if wait_and_detect_for_any $PROCESS_FAILURE "$pids"; then + echo "# PASSED LARGE limit negative testing." +else + echo "# Failed on LARGE limit negative testing, no test fails." + clean_up + exit 1 +fi + +echo "# Start 8 concurrent unclobbered_vdso_oversubscribed tests with LARGER limit, + expecting no failure...." +pids="" +for i in 1 2 3 4 5 6 7 8; do + ( + ./ash_cgexec.sh $TEST_CG_SUB4 $test_cmd >cgtest_larger_$timestamp.$i.log 2>&1 + ) & + pids="$pids $!" +done + +if wait_and_detect_for_any $PROCESS_FAILURE "$pids"; then + echo "# Failed on LARGER limit, at least one test fails." + clean_up + exit 1 +else + echo "# PASSED LARGER limit tests." +fi + +echo "# Start 8 concurrent unclobbered_vdso_oversubscribed tests with LARGER limit, + randomly kill one, expecting no failure...." +pids="" +for i in 1 2 3 4 5 6 7 8; do + ( + ./ash_cgexec.sh $TEST_CG_SUB4 $test_cmd >cgtest_larger_kill_$timestamp.$i.log 2>&1 + ) & + pids="$pids $!" +done +random_number=$(awk 'BEGIN{srand();print int(rand()*5)}') +sleep $((random_number + 1)) + +# Randomly select a process to kill +# Make sure usage counter not leaked at the end. +RANDOM_INDEX=$(awk 'BEGIN{srand();print int(rand()*8)}') +counter=0 +for pid in $pids; do + if [ "$counter" -eq "$RANDOM_INDEX" ]; then + PID_TO_KILL=$pid + break + fi + counter=$((counter + 1)) +done + +kill $PID_TO_KILL +echo "# Killed process with PID: $PID_TO_KILL" + +any_failure=0 +for pid in $pids; do + wait "$pid" + status=$? + if [ "$pid" != "$PID_TO_KILL" ]; then + if [ $status -ne 0 ]; then + echo "# Process $pid returned failure." + any_failure=1 + fi + fi +done + +if [ $any_failure -ne 0 ]; then + echo "# Failed on random killing, at least one test fails." + clean_up + exit 1 +fi +echo "# PASSED LARGER limit test with a process randomly killed." + +MEM_LIMIT_TOO_SMALL=$((CAPACITY - 2 * LARGE)) + +echo "$MEM_LIMIT_TOO_SMALL" > $CG_ROOT/$TEST_CG_SUB2/memory.max +if [ $? -ne 0 ]; then + echo "# Failed creating memory controller." + clean_up + exit 1 +fi + +echo "# Start 4 concurrent unclobbered_vdso_oversubscribed tests with LARGE EPC limit, + and too small RAM limit, expecting all failures...." +# Ensure swapping off so the OOM killer is activated when mem_cgroup limit is hit. +swapoff -a +pids="" +for i in 1 2 3 4; do + ( + ./ash_cgexec.sh $TEST_CG_SUB2 $test_cmd >cgtest_large_oom_$timestamp.$i.log 2>&1 + ) & + pids="$pids $!" +done + +if wait_and_detect_for_any $PROCESS_SUCCESS "$pids"; then + echo "# Failed on tests with memcontrol, some tests did not fail." + clean_up + swapon -a + exit 1 +else + swapon -a + echo "# PASSED LARGE limit tests with memcontrol." +fi + +sleep 2 + +USAGE=$(grep '^sgx_epc' "$CG_ROOT/$TEST_ROOT_CG/misc.current" | awk '{print $2}') +if [ "$USAGE" -ne 0 ]; then + echo "# Failed: Final usage is $USAGE, not 0." +else + echo "# PASSED leakage check." + echo "# PASSED ALL cgroup limit tests, cleanup cgroups..." +fi +clean_up +echo "# done." diff --git a/tools/testing/selftests/sgx/watch_misc_for_tests.sh b/tools/testing/selftests/sgx/watch_misc_for_tests.sh new file mode 100755 index 000000000000..1c9985726ace --- /dev/null +++ b/tools/testing/selftests/sgx/watch_misc_for_tests.sh @@ -0,0 +1,11 @@ +#!/usr/bin/env sh +# SPDX-License-Identifier: GPL-2.0 +# Copyright(c) 2023, 2024 Intel Corporation. + +if [ -z "$1" ]; then + echo "No argument supplied, please provide 'max', 'current', or 'events'" + exit 1 +fi + +watch -n 1 'find /sys/fs/cgroup -wholename "*/test*/misc.'$1'" -exec \ + sh -c '\''echo "$1:"; cat "$1"'\'' _ {} \;'
To run selftests for EPC cgroup: sudo ./run_epc_cg_selftests.sh To watch misc cgroup 'current' changes during testing, run this in a separate terminal: ./watch_misc_for_tests.sh current With different cgroups, the script starts one or multiple concurrent SGX selftests (test_sgx), each to run the unclobbered_vdso_oversubscribed test case, which loads an enclave of EPC size equal to the EPC capacity available on the platform. The script checks results against the expectation set for each cgroup and reports success or failure. The script creates 3 different cgroups at the beginning with following expectations: 1) SMALL - intentionally small enough to fail the test loading an enclave of size equal to the capacity. 2) LARGE - large enough to run up to 4 concurrent tests but fail some if more than 4 concurrent tests are run. The script starts 4 expecting at least one test to pass, and then starts 5 expecting at least one test to fail. 3) LARGER - limit is the same as the capacity, large enough to run lots of concurrent tests. The script starts 8 of them and expects all pass. Then it reruns the same test with one process randomly killed and usage checked to be zero after all processes exit. The script also includes a test with low mem_cg limit and LARGE sgx_epc limit to verify that the RAM used for per-cgroup reclamation is charged to a proper mem_cg. For this test, it turns off swapping before start, and turns swapping back on afterwards. Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> --- V11: - Remove cgroups-tools dependency and make scripts ash compatible. (Jarkko) - Drop support for cgroup v1 and simplify. (Michal, Jarkko) - Add documentation for functions. (Jarkko) - Turn off swapping before memcontrol tests and back on after - Format and style fixes, name for hard coded values V7: - Added memcontrol test. V5: - Added script with automatic results checking, remove the interactive script. - The script can run independent from the series below. --- tools/testing/selftests/sgx/ash_cgexec.sh | 16 + .../selftests/sgx/run_epc_cg_selftests.sh | 275 ++++++++++++++++++ .../selftests/sgx/watch_misc_for_tests.sh | 11 + 3 files changed, 302 insertions(+) create mode 100755 tools/testing/selftests/sgx/ash_cgexec.sh create mode 100755 tools/testing/selftests/sgx/run_epc_cg_selftests.sh create mode 100755 tools/testing/selftests/sgx/watch_misc_for_tests.sh