Message ID | 283f09a5-33bd-eac3-bdfd-83d775045bf9@linux.intel.com (mailing list archive) |
---|---|
Series | Introduce CAP_SYS_PERFMON capability for secure Perf users groups |
On 12/5/2019 8:15 AM, Alexey Budankov wrote:
> Currently access to perf_events functionality [1] beyond the scope permitted
> by perf_event_paranoid [1] kernel setting is allowed to a privileged process
> [2] with CAP_SYS_ADMIN capability enabled in the process effective set [3].
>
> This patch set introduces CAP_SYS_PERFMON capability devoted to secure performance
> monitoring activity so that CAP_SYS_PERFMON would assist CAP_SYS_ADMIN in its
> governing role for perf_events based performance monitoring of a system.
>
> CAP_SYS_PERFMON aims to harden system security and integrity when monitoring
> performance using perf_events subsystem by processes and Perf privileged users
> [2], thus decreasing attack surface that is available to CAP_SYS_ADMIN
> privileged processes [3].

Are there use cases where you would need CAP_SYS_PERFMON where you
would not also need CAP_SYS_ADMIN? If you separate a new capability
from CAP_SYS_ADMIN but always have to use CAP_SYS_ADMIN in conjunction
with the new capability it is all rather pointless.

The scope you've defined for this CAP_SYS_PERFMON is very small.
Is there a larger set of privilege checks that might be applicable
for it?

>
> CAP_SYS_PERFMON aims to take over CAP_SYS_ADMIN credentials related to
> performance monitoring functionality of perf_events and balance amount of
> CAP_SYS_ADMIN credentials in accordance with the recommendations provided in
> the man page for CAP_SYS_ADMIN [3]: "Note: this capability is overloaded;
> see Notes to kernel developers, below."
>
> For backward compatibility reasons performance monitoring functionality of
> perf_events subsystem remains available under CAP_SYS_ADMIN but its usage for
> secure performance monitoring use cases is discouraged with respect to the
> introduced CAP_SYS_PERFMON capability.
>
> In the suggested implementation CAP_SYS_PERFMON enables Perf privileged users
> [2] to conduct secure performance monitoring using perf_events in the scope
> of available online CPUs when executing code in kernel and user modes.
>
> Possible alternative solution to this capabilities balancing, system security
> hardening task could be to use the existing CAP_SYS_PTRACE capability to govern
> perf_events' performance monitoring functionality, since process debugging is
> similar to performance monitoring with respect to providing insights into
> process memory and execution details. However CAP_SYS_PTRACE still provides
> users with more credentials than are required for secure performance monitoring
> using perf_events subsystem and this excess is avoided by using the dedicated
> CAP_SYS_PERFMON capability.
>
> libcap library utilities [4], [5] and Perf tool can be used to apply
> CAP_SYS_PERFMON capability for secure performance monitoring beyond the scope
> permitted by system wide perf_event_paranoid kernel setting and below are the
> steps to evaluate the advancement suggested by the patch set:
>
> - patch, build and boot the kernel
> - patch, build Perf tool e.g. to /home/user/perf
> ...
> # git clone git://git.kernel.org/pub/scm/libs/libcap/libcap.git libcap
> # pushd libcap
> # patch libcap/include/uapi/linux/capabilities.h with [PATCH 1/3]
> # make
> # pushd progs
> # ./setcap "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
> # ./setcap -v "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
> /home/user/perf: OK
> # ./getcap /home/user/perf
> /home/user/perf = cap_sys_ptrace,cap_syslog,cap_sys_perfmon+ep
> # echo 2 > /proc/sys/kernel/perf_event_paranoid
> # cat /proc/sys/kernel/perf_event_paranoid
> 2
> ...
> $ /home/user/perf top
> ... works as expected ...
> $ cat /proc/`pidof perf`/status
> Name: perf
> Umask: 0002
> State: S (sleeping)
> Tgid: 2958
> Ngid: 0
> Pid: 2958
> PPid: 9847
> TracerPid: 0
> Uid: 500 500 500 500
> Gid: 500 500 500 500
> FDSize: 256
> ...
> CapInh: 0000000000000000
> CapPrm: 0000004400080000
> CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
>         cap_sys_perfmon,cap_sys_ptrace,cap_syslog
> CapBnd: 0000007fffffffff
> CapAmb: 0000000000000000
> NoNewPrivs: 0
> Seccomp: 0
> Speculation_Store_Bypass: thread vulnerable
> Cpus_allowed: ff
> Cpus_allowed_list: 0-7
> ...
>
> Usage of cap_sys_perfmon effectively avoids unused credentials excess:
> - with cap_sys_admin:
>   CapEff: 0000007fffffffff => 01111111 11111111 11111111 11111111 11111111
> - with cap_sys_perfmon:
>   CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
>           bit:                38          34     19
>           capability:         sys_perfmon syslog sys_ptrace
>
> The patch set is for tip perf/core repository:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip perf/core
> tip sha1: ceb9e77324fa661b1001a0ae66f061b5fcb4e4e6
>
> [1] http://man7.org/linux/man-pages/man2/perf_event_open.2.html
> [2] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
> [3] http://man7.org/linux/man-pages/man7/capabilities.7.html
> [4] http://man7.org/linux/man-pages/man8/setcap.8.html
> [5] https://git.kernel.org/pub/scm/libs/libcap/libcap.git
> [6] https://sites.google.com/site/fullycapable/, posix_1003.1e-990310.pdf
>
> ---
> Alexey Budankov (3):
>   capabilities: introduce CAP_SYS_PERFMON to kernel and user space
>   perf/core: apply CAP_SYS_PERFMON to CPUs and kernel monitoring
>   perf tool: extend Perf tool with CAP_SYS_PERFMON support
>
>  include/linux/perf_event.h          |  6 ++++--
>  include/uapi/linux/capability.h     | 10 +++++++++-
>  security/selinux/include/classmap.h |  4 ++--
>  tools/perf/design.txt               |  3 ++-
>  tools/perf/util/cap.h               |  4 ++++
>  tools/perf/util/evsel.c             | 10 +++++-----
>  tools/perf/util/util.c              | 15 +++++++++++++--
>  7 files changed, 39 insertions(+), 13 deletions(-)
>
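The CapEff comparison in the cover letter can be checked by hand. The following is a minimal sketch, not part of the patch set, that decodes a CapEff-style hex mask from /proc/<pid>/status into capability bit numbers. The function names decode_caps and describe are made up for illustration; bit numbers 19 (cap_sys_ptrace) and 34 (cap_syslog) are the values from linux/capability.h, and 38 is the number the cover letter shows for the proposed cap_sys_perfmon.

```python
# Capability bit numbers. 19 and 34 come from <linux/capability.h>;
# 38 is the value the cover letter shows for the proposed capability.
CAP_NAMES = {
    19: "cap_sys_ptrace",
    34: "cap_syslog",
    38: "cap_sys_perfmon",  # assumption: number as proposed in this patch set
}


def decode_caps(mask_hex):
    """Return the sorted bit numbers set in a CapEff-style hex mask."""
    mask = int(mask_hex, 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def describe(mask_hex):
    """Map each set bit to a name, falling back to 'bit<N>' when unknown."""
    return [CAP_NAMES.get(bit, f"bit{bit}") for bit in decode_caps(mask_hex)]


if __name__ == "__main__":
    # Mask of the perf binary from the cover letter.
    print(decode_caps("0000004400080000"))
    print(describe("0000004400080000"))
    # Full mask: count how many capability bits a fully privileged process holds.
    print(len(decode_caps("0000007fffffffff")))
```

Run against the two masks quoted above, this makes the reduction in granted capabilities concrete: three bits set for the perf binary versus every bit from 0 through 38 for the full mask.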
Hello Casey,

On 05.12.2019 19:49, Casey Schaufler wrote:
> Are there use cases where you would need CAP_SYS_PERFMON where you
> would not also need CAP_SYS_ADMIN? If you separate a new capability

Actually, there are. The Perf tool, with its record, stat and top modes,
could run with the CAP_SYS_PERFMON capability, as shown in the cover
letter, and provide system-wide performance data. Currently, for that to
work, the tool needs to be granted CAP_SYS_ADMIN.

> from CAP_SYS_ADMIN but always have to use CAP_SYS_ADMIN in conjunction
> with the new capability it is all rather pointless.
>
> The scope you've defined for this CAP_SYS_PERFMON is very small.
> Is there a larger set of privilege checks that might be applicable
> for it?

CAP_SYS_PERFMON could be applied more broadly; this patch set, however,
enables the record and stat mode use cases for system-wide performance
monitoring in kernel and user modes.

Thanks,
Alexey
On 12/5/2019 9:05 AM, Alexey Budankov wrote:
> Hello Casey,
>
> On 05.12.2019 19:49, Casey Schaufler wrote:
>> Are there use cases where you would need CAP_SYS_PERFMON where you
>> would not also need CAP_SYS_ADMIN? If you separate a new capability
> Actually, there are. Perf tool that has record, stat and top modes could run with
> CAP_SYS_PERFMON capability as mentioned below and provide system wide performance
> data. Currently for that to work the tool needs to be granted with CAP_SYS_ADMIN.

The question isn't whether the tool could use the capability, it's whether
the tool would also need CAP_SYS_ADMIN to be useful. Are there existing
tools that could stop using CAP_SYS_ADMIN in favor of CAP_SYS_PERFMON?
My bet is that any tool that does performance monitoring is going to need
CAP_SYS_ADMIN for other reasons.

>
>> from CAP_SYS_ADMIN but always have to use CAP_SYS_ADMIN in conjunction
>> with the new capability it is all rather pointless.
>>
>> The scope you've defined for this CAP_SYS_PERFMON is very small.
>> Is there a larger set of privilege checks that might be applicable
>> for it?
> CAP_SYS_PERFMON could be applied broadly, though, this patch set enables record
> and stat mode use cases for system wide performance monitoring in kernel and
> user modes.

The granularity of capabilities is something we have to watch
very carefully. Sure, CAP_SYS_ADMIN covers a lot of things, but
if we broke it up "properly" we'd have hundreds of capabilities.
If you want control that finely we have SELinux.

>
> Thanks,
> Alexey
> The question isn't whether the tool could use the capability, it's whether
> the tool would also need CAP_SYS_ADMIN to be useful. Are there existing
> tools that could stop using CAP_SYS_ADMIN in favor of CAP_SYS_PERFMON?
> My bet is that any tool that does performance monitoring is going to need
> CAP_SYS_ADMIN for other reasons.

At least perf stat won't.

-Andi
On 05.12.2019 20:33, Casey Schaufler wrote:
> The question isn't whether the tool could use the capability, it's whether
> the tool would also need CAP_SYS_ADMIN to be useful. Are there existing
> tools that could stop using CAP_SYS_ADMIN in favor of CAP_SYS_PERFMON?
> My bet is that any tool that does performance monitoring is going to need
> CAP_SYS_ADMIN for other reasons.

Yes, sorry. The tool is the perf tool (part of the kernel tree). If its
binary is granted the CAP_SYS_ADMIN capability, the tool can collect
performance data in system-wide mode for some group of unprivileged
users. This patch set allows replacing CAP_SYS_ADMIN with
CAP_SYS_PERFMON, e.g. for the perf tool. Granted CAP_SYS_PERFMON instead,
the tool could still provide performance data in system-wide scope for
the same group of unprivileged users.

I hope that makes it clearer. Feel free to ask more.

Thanks,
Alexey
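The replacement described in this message can be pictured with a small model. This is a simplified sketch under stated assumptions, not the kernel code from the patch set: the helper names perfmon_capable, may_monitor_cpu and may_profile_kernel are hypothetical, CAP_SYS_ADMIN is bit 21 in linux/capability.h, CAP_SYS_PERFMON is bit 38 as shown in the cover letter, and the perf_event_paranoid thresholds follow the documented meaning of levels 1 (no CPU-wide events for unprivileged users) and 2 (no kernel profiling for unprivileged users).

```python
CAP_SYS_ADMIN = 21    # from <linux/capability.h>
CAP_SYS_PERFMON = 38  # assumption: number as proposed in this patch set


def perfmon_capable(effective):
    """Hypothetical helper: the dedicated capability is sufficient, and
    CAP_SYS_ADMIN keeps working for backward compatibility."""
    return CAP_SYS_PERFMON in effective or CAP_SYS_ADMIN in effective


def may_monitor_cpu(paranoid, effective):
    """System-wide (per-CPU) monitoring: open to everyone while
    perf_event_paranoid is below 1, otherwise privileged processes only."""
    return paranoid < 1 or perfmon_capable(effective)


def may_profile_kernel(paranoid, effective):
    """Kernel-mode profiling: open to everyone while perf_event_paranoid
    is below 2, otherwise privileged processes only."""
    return paranoid < 2 or perfmon_capable(effective)


if __name__ == "__main__":
    # perf_event_paranoid = 2, as in the cover letter's walkthrough.
    for caps in (set(), {CAP_SYS_ADMIN}, {CAP_SYS_PERFMON}):
        print(sorted(caps), may_monitor_cpu(2, caps), may_profile_kernel(2, caps))
```

The point of the model is that a binary holding only the dedicated capability passes the same checks that previously required CAP_SYS_ADMIN, while a process with neither is still bounded by perf_event_paranoid.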
On 05.12.2019 20:33, Casey Schaufler wrote:
> The granularity of capabilities is something we have to watch
> very carefully. Sure, CAP_SYS_ADMIN covers a lot of things, but
> if we broke it up "properly" we'd have hundreds of capabilities.

Fully agree, and this broader discussion is really helpful for arriving
at a properly balanced solution.

> If you want control that finely we have SELinux.

Undoubtedly, SELinux is a powerful, mature layer of functionality that
could provide benefits not only for the perf_events subsystem. However,
perf_events is built around capabilities to provide access control to its
functionality, so perf_events would require considerable rework before it
could be controlled through SELinux. Adoption could then also require
changes to the installed infrastructure just for the sake of adopting an
alternative access control mechanism.

On the other hand, there are already existing users and use cases built
around CAP_SYS_ADMIN based access control, and the Perf tool, which is
the native Linux kernel observability and performance profiling tool,
provides means to operate in restricted multiuser environments (HPC
clusters, cloud and virtual environments) for groups of unprivileged
users under admin control [1].

In these circumstances CAP_SYS_PERFMON looks like a smart, balanced
advancement that trades off perf_events subsystem extensions, the
required level of control and configurability of perf_events, and the
adoption effort for existing users, and it brings the security hardening
benefit of a smaller attack surface for the existing users and use cases.

Well, yes, it is really good that Linux nowadays provides a handful of
security mechanisms, but proper balance is what usually makes valuable
features happen, keeps their users happy and moves things forward.

Gratefully,
Alexey

[1] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
On Wed, Dec 11, 2019 at 01:52:15PM +0300, Alexey Budankov wrote:
> Undoubtedly, SELinux is the powerful, mature, whole level of functionality that
> could provide benefits not only for perf_events subsystem. However perf_events
> is built around capabilities to provide access control to its functionality,
> thus perf_events would require considerable rework prior it could be controlled
> thru SELinux.

You mean this:

  da97e18458fb ("perf_event: Add support for LSM and SELinux checks")

?

> Then the adoption could also require changes to the installed
> infrastructure just for the sake of adopting alternative access control mechanism.

This is still very much true.
On 11.12.2019 18:24, Peter Zijlstra wrote:
> On Wed, Dec 11, 2019 at 01:52:15PM +0300, Alexey Budankov wrote:
>> Undoubtedly, SELinux is the powerful, mature, whole level of functionality that
>> could provide benefits not only for perf_events subsystem. However perf_events
>> is built around capabilities to provide access control to its functionality,
>> thus perf_events would require considerable rework prior it could be controlled
>> thru SELinux.
>
> You mean this:
>
>   da97e18458fb ("perf_event: Add support for LSM and SELinux checks")
>
> ?

Yes, I do. This feature adds a great deal to MAC access control [1], [2] for
perf_events, in addition to the already existing DAC [3]. However, there is
still the whole other part of the MAC story on the user space side.

Fortunately, MAC and DAC access control mechanisms are designed so that they
are naturally layered and coexist in the system, so I don't see any
contradiction in advancing either mechanism to meet the demand of possible
diverse use cases. There is not much rationale for favoring one mechanism over
the other, because together they constitute complete integrity of security
access control and configurability for diverse use cases of perf_events.

>
>> Then the adoption could also require changes to the installed
>> infrastructure just for the sake of adopting alternative access control mechanism.
>
> This is still very much true.

It is enough to imagine some HPC cluster or Cloud lab with several hundred
nodes to be upgraded.

Thanks,
Alexey

[1] https://en.wikipedia.org/wiki/Security-Enhanced_Linux
[2] https://en.wikipedia.org/wiki/Mandatory_access_control
[3] https://en.wikipedia.org/wiki/Discretionary_access_control
On 12/11/2019 2:52 AM, Alexey Budankov wrote:
> On 05.12.2019 20:33, Casey Schaufler wrote:
>> On 12/5/2019 9:05 AM, Alexey Budankov wrote:
>>> Hello Casey,
>>>
>>> On 05.12.2019 19:49, Casey Schaufler wrote:
>>>> On 12/5/2019 8:15 AM, Alexey Budankov wrote:
>>>>> Currently access to perf_events functionality [1] beyond the scope permitted
>>>>> by perf_event_paranoid [1] kernel setting is allowed to a privileged process
>>>>> [2] with CAP_SYS_ADMIN capability enabled in the process effective set [3].
>>>>>
>>>>> This patch set introduces CAP_SYS_PERFMON capability devoted to secure performance
>>>>> monitoring activity so that CAP_SYS_PERFMON would assist CAP_SYS_ADMIN in its
>>>>> governing role for perf_events based performance monitoring of a system.
>>>>>
>>>>> CAP_SYS_PERFMON aims to harden system security and integrity when monitoring
>>>>> performance using perf_events subsystem by processes and Perf privileged users
>>>>> [2], thus decreasing attack surface that is available to CAP_SYS_ADMIN
>>>>> privileged processes [3].
>>>> Are there use cases where you would need CAP_SYS_PERFMON where you
>>>> would not also need CAP_SYS_ADMIN? If you separate a new capability
>>> Actually, there are. Perf tool that has record, stat and top modes could run with
>>> CAP_SYS_PERFMON capability as mentioned below and provide system wide performance
>>> data. Currently for that to work the tool needs to be granted with CAP_SYS_ADMIN.
>> The question isn't whether the tool could use the capability, it's whether
>> the tool would also need CAP_SYS_ADMIN to be useful. Are there existing
>> tools that could stop using CAP_SYS_ADMIN in favor of CAP_SYS_PERFMON?
>> My bet is that any tool that does performance monitoring is going to need
>> CAP_SYS_ADMIN for other reasons.
>>
>>>> from CAP_SYS_ADMIN but always have to use CAP_SYS_ADMIN in conjunction
>>>> with the new capability it is all rather pointless.
>>>>
>>>> The scope you've defined for this CAP_SYS_PERFMON is very small.
>>>> Is there a larger set of privilege checks that might be applicable
>>>> for it?
>>> CAP_SYS_PERFMON could be applied broadly, though, this patch set enables record
>>> and stat mode use cases for system wide performance monitoring in kernel and
>>> user modes.
>> The granularity of capabilities is something we have to watch
>> very carefully. Sure, CAP_SYS_ADMIN covers a lot of things, but
>> if we broke it up "properly" we'd have hundreds of capabilities.
> Fully agree and this broader discussion is really helpful to come up with
> properly balanced solution.
>
>> If you want control that finely we have SELinux.
> Undoubtedly, SELinux is the powerful, mature, whole level of functionality that
> could provide benefits not only for perf_events subsystem. However perf_events
> is built around capabilities to provide access control to its functionality,
> thus perf_events would require considerable rework prior it could be controlled
> thru SELinux. Then the adoption could also require changes to the installed
> infrastructure just for the sake of adopting alternative access control mechanism.
>
> On the other hand there are currently already existing users and use cases that
> are built around the CAP_SYS_ADMIN based access control, and Perf tool, which is
> the native Linux kernel observability and performance profiling tool, provides
> means to operate in restricted multiuser environments(HPC clusters, cloud and
> virtual environments) for groups of unprivileged users under admins control [1].
>
> In this circumstances CAP_SYS_PERFMON looks like smart balanced advancement that
> trade-offs between perf_events subsystem extensions, required level of control
> and configurability of perf_events, existing users adoption effort, and it brings
> security hardening benefits of decreasing attack surface for the existing users
> and use cases.

I'm not 100% opposed to CAP_SYS_PERFMON. I am 100% opposed to new capabilities
that have a single use. Surely there are other CAP_SYS_ADMIN users that [cs]ould
be converted to CAP_SYS_PERFMON as well. If there is a class of system performance
privileged operations, say a dozen or so, you may have a viable argument.

>
> Well, yes, it is really good that Linux nowadays provides a handful of various
> security assuring mechanisms but proper balance is what usually makes valuable
> features happen and its users happy and moves forward.
>
> Gratefully,
> Alexey
>
> [1] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
On Thu, Dec 5, 2019 at 9:35 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> On 12/5/2019 9:05 AM, Alexey Budankov wrote:
> > Hello Casey,
> >
> > On 05.12.2019 19:49, Casey Schaufler wrote:
> >> On 12/5/2019 8:15 AM, Alexey Budankov wrote:
> >>> Currently access to perf_events functionality [1] beyond the scope permitted
> >>> by perf_event_paranoid [1] kernel setting is allowed to a privileged process
> >>> [2] with CAP_SYS_ADMIN capability enabled in the process effective set [3].
> >>>
> >>> This patch set introduces CAP_SYS_PERFMON capability devoted to secure performance
> >>> monitoring activity so that CAP_SYS_PERFMON would assist CAP_SYS_ADMIN in its
> >>> governing role for perf_events based performance monitoring of a system.
> >>>
> >>> CAP_SYS_PERFMON aims to harden system security and integrity when monitoring
> >>> performance using perf_events subsystem by processes and Perf privileged users
> >>> [2], thus decreasing attack surface that is available to CAP_SYS_ADMIN
> >>> privileged processes [3].
> >> Are there use cases where you would need CAP_SYS_PERFMON where you
> >> would not also need CAP_SYS_ADMIN? If you separate a new capability
> > Actually, there are. Perf tool that has record, stat and top modes could run with
> > CAP_SYS_PERFMON capability as mentioned below and provide system wide performance
> > data. Currently for that to work the tool needs to be granted with CAP_SYS_ADMIN.
>
> The question isn't whether the tool could use the capability, it's whether
> the tool would also need CAP_SYS_ADMIN to be useful. Are there existing
> tools that could stop using CAP_SYS_ADMIN in favor of CAP_SYS_PERFMON?

The answer is yes. I have recently been alerted to a problem with paranoid=2
and the popular rr debugger (https://rr-project.org/). This debugger uses
several perf_events features, including profiling of PMU events and tracepoints
(context-switches). With paranoid=2, it does not work anymore. We would need a
privilege between regular user and admin to make it work again. Note that
context switches tracepoint is only applied to self (not system-wide).

> My bet is that any tool that does performance monitoring is going to need
> CAP_SYS_ADMIN for other reasons.
>
> >
> >> from CAP_SYS_ADMIN but always have to use CAP_SYS_ADMIN in conjunction
> >> with the new capability it is all rather pointless.
> >>
> >> The scope you've defined for this CAP_SYS_PERFMON is very small.
> >> Is there a larger set of privilege checks that might be applicable
> >> for it?
> > CAP_SYS_PERFMON could be applied broadly, though, this patch set enables record
> > and stat mode use cases for system wide performance monitoring in kernel and
> > user modes.
>
> The granularity of capabilities is something we have to watch
> very carefully. Sure, CAP_SYS_ADMIN covers a lot of things, but
> if we broke it up "properly" we'd have hundreds of capabilities.
> If you want control that finely we have SELinux.
> >
> > Thanks,
> > Alexey
> >
> >>
> >>
> >>> CAP_SYS_PERFMON aims to take over CAP_SYS_ADMIN credentials related to
> >>> performance monitoring functionality of perf_events and balance amount of
> >>> CAP_SYS_ADMIN credentials in accordance with the recommendations provided in
> >>> the man page for CAP_SYS_ADMIN [3]: "Note: this capability is overloaded;
> >>> see Notes to kernel developers, below."
> >>>
> >>> For backward compatibility reasons performance monitoring functionality of
> >>> perf_events subsystem remains available under CAP_SYS_ADMIN but its usage for
> >>> secure performance monitoring use cases is discouraged with respect to the
> >>> introduced CAP_SYS_PERFMON capability.
> >>>
> >>> In the suggested implementation CAP_SYS_PERFMON enables Perf privileged users
> >>> [2] to conduct secure performance monitoring using perf_events in the scope
> >>> of available online CPUs when executing code in kernel and user modes.
> >>>
> >>> Possible alternative solution to this capabilities balancing, system security
> >>> hardening task could be to use the existing CAP_SYS_PTRACE capability to govern
> >>> perf_events' performance monitoring functionality, since process debugging is
> >>> similar to performance monitoring with respect to providing insights into
> >>> process memory and execution details. However CAP_SYS_PTRACE still provides
> >>> users with more credentials than are required for secure performance monitoring
> >>> using perf_events subsystem and this excess is avoided by using the dedicated
> >>> CAP_SYS_PERFMON capability.
> >>>
> >>> libcap library utilities [4], [5] and Perf tool can be used to apply
> >>> CAP_SYS_PERFMON capability for secure performance monitoring beyond the scope
> >>> permitted by system wide perf_event_paranoid kernel setting and below are the
> >>> steps to evaluate the advancement suggested by the patch set:
> >>>
> >>> - patch, build and boot the kernel
> >>> - patch, build Perf tool e.g. to /home/user/perf
> >>> ...
> >>> # git clone git://git.kernel.org/pub/scm/libs/libcap/libcap.git libcap
> >>> # pushd libcap
> >>> # patch libcap/include/uapi/linux/capabilities.h with [PATCH 1/3]
> >>> # make
> >>> # pushd progs
> >>> # ./setcap "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
> >>> # ./setcap -v "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
> >>> /home/user/perf: OK
> >>> # ./getcap /home/user/perf
> >>> /home/user/perf = cap_sys_ptrace,cap_syslog,cap_sys_perfmon+ep
> >>> # echo 2 > /proc/sys/kernel/perf_event_paranoid
> >>> # cat /proc/sys/kernel/perf_event_paranoid
> >>> 2
> >>> ...
> >>> $ /home/user/perf top
> >>> ... works as expected ...
> >>> $ cat /proc/`pidof perf`/status
> >>> Name: perf
> >>> Umask: 0002
> >>> State: S (sleeping)
> >>> Tgid: 2958
> >>> Ngid: 0
> >>> Pid: 2958
> >>> PPid: 9847
> >>> TracerPid: 0
> >>> Uid: 500 500 500 500
> >>> Gid: 500 500 500 500
> >>> FDSize: 256
> >>> ...
> >>> CapInh: 0000000000000000
> >>> CapPrm: 0000004400080000
> >>> CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
> >>> cap_sys_perfmon,cap_sys_ptrace,cap_syslog
> >>> CapBnd: 0000007fffffffff
> >>> CapAmb: 0000000000000000
> >>> NoNewPrivs: 0
> >>> Seccomp: 0
> >>> Speculation_Store_Bypass: thread vulnerable
> >>> Cpus_allowed: ff
> >>> Cpus_allowed_list: 0-7
> >>> ...
> >>>
> >>> Usage of cap_sys_perfmon effectively avoids unused credentials excess:
> >>> - with cap_sys_admin:
> >>> CapEff: 0000007fffffffff => 01111111 11111111 11111111 11111111 11111111
> >>> - with cap_sys_perfmon:
> >>> CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
> >>> 38 34 19
> >>> sys_perfmon syslog sys_ptrace
> >>>
> >>> The patch set is for tip perf/core repository:
> >>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip perf/core
> >>> tip sha1: ceb9e77324fa661b1001a0ae66f061b5fcb4e4e6
> >>>
> >>> [1] http://man7.org/linux/man-pages/man2/perf_event_open.2.html
> >>> [2] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
> >>> [3] http://man7.org/linux/man-pages/man7/capabilities.7.html
> >>> [4] http://man7.org/linux/man-pages/man8/setcap.8.html
> >>> [5] https://git.kernel.org/pub/scm/libs/libcap/libcap.git
> >>> [6] https://sites.google.com/site/fullycapable/, posix_1003.1e-990310.pdf
> >>>
> >>> ---
> >>> Alexey Budankov (3):
> >>> capabilities: introduce CAP_SYS_PERFMON to kernel and user space
> >>> perf/core: apply CAP_SYS_PERFMON to CPUs and kernel monitoring
> >>> perf tool: extend Perf tool with CAP_SYS_PERFMON support
> >>>
> >>> include/linux/perf_event.h          |  6 ++++--
> >>> include/uapi/linux/capability.h     | 10 +++++++++-
> >>> security/selinux/include/classmap.h |  4 ++--
> >>> tools/perf/design.txt               |  3 ++-
> >>> tools/perf/util/cap.h               |  4 ++++
> >>> tools/perf/util/evsel.c             | 10 +++++-----
> >>> tools/perf/util/util.c              | 15 +++++++++++++--
> >>> 7 files changed, 39 insertions(+), 13 deletions(-)
> >>>
> >>
>
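The CapEff value in the quoted cover letter above (0000004400080000, annotated
with bits 38, 34 and 19) can be checked mechanically. What follows is only a
minimal sketch: decode_caps and describe_caps are hypothetical helper names,
and the number 38 for cap_sys_perfmon is the value shown in the cover letter
for the proposed capability, not something taken from a released kernel header.

```python
# Capability numbers used for illustration. 19 (CAP_SYS_PTRACE) and
# 34 (CAP_SYSLOG) are the values from include/uapi/linux/capability.h;
# 38 is the number the cover letter shows for the proposed CAP_SYS_PERFMON
# (an assumption taken from the thread).
CAP_NAMES = {
    19: "cap_sys_ptrace",
    34: "cap_syslog",
    38: "cap_sys_perfmon",
}


def decode_caps(mask_hex):
    """Return the sorted bit positions set in a /proc/<pid>/status capability mask."""
    mask = int(mask_hex, 16)
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


def describe_caps(mask_hex):
    """Map each set bit to a name, falling back to 'cap_<n>' for unlisted bits."""
    return [CAP_NAMES.get(bit, "cap_%d" % bit) for bit in decode_caps(mask_hex)]


if __name__ == "__main__":
    # CapEff of the perf process from the cover letter.
    print(decode_caps("0000004400080000"))
    print(describe_caps("0000004400080000"))
```

Applied to the CapEff of the perf process this yields exactly the three bits
listed in the cover letter, whereas the cap_sys_admin case (0000007fffffffff)
has all 39 bits set.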
> > In this circumstances CAP_SYS_PERFMON looks like smart balanced advancement that
> > trade-offs between perf_events subsystem extensions, required level of control
> > and configurability of perf_events, existing users adoption effort, and it brings
> > security hardening benefits of decreasing attack surface for the existing users
> > and use cases.
>
> I'm not 100% opposed to CAP_SYS_PERFMON. I am 100% opposed to new capabilities
> that have a single use. Surely there are other CAP_SYS_ADMIN users that [cs]ould
> be converted to CAP_SYS_PERFMON as well. If there is a class of system performance
> privileged operations, say a dozen or so, you may have a viable argument.

perf events is not a single use. It has a bazillion of sub functionalities,
including hardware tracing, software tracing, pmu counters, software counters,
uncore counters, break points and various other stuff in its PMU drivers.

See it more as a whole quite heterogenous driver subsystem.

I guess CAP_SYS_PERFMON is not a good name because perf is much more
than just Perfmon. Perhaps call it CAP_SYS_PERF_EVENTS

-Andi
On 12/11/2019 12:36 PM, Andi Kleen wrote:
>>> In this circumstances CAP_SYS_PERFMON looks like smart balanced advancement that
>>> trade-offs between perf_events subsystem extensions, required level of control
>>> and configurability of perf_events, existing users adoption effort, and it brings
>>> security hardening benefits of decreasing attack surface for the existing users
>>> and use cases.
>> I'm not 100% opposed to CAP_SYS_PERFMON. I am 100% opposed to new capabilities
>> that have a single use. Surely there are other CAP_SYS_ADMIN users that [cs]ould
>> be converted to CAP_SYS_PERFMON as well. If there is a class of system performance
>> privileged operations, say a dozen or so, you may have a viable argument.
> perf events is not a single use.

If it is only being called in two places, it is single use.

> It has a bazillion of sub functionalities,
> including hardware tracing, software tracing, pmu counters, software counters,
> uncore counters, break points and various other stuff in its PMU drivers.
>
> See it more as a whole quite heterogenous driver subsystem.
>
> I guess CAP_SYS_PERFMON is not a good name because perf is much more
> than just Perfmon. Perhaps call it CAP_SYS_PERF_EVENTS
>
> -Andi
On 12/11/19 3:36 PM, Andi Kleen wrote:
>>> In this circumstances CAP_SYS_PERFMON looks like smart balanced advancement that
>>> trade-offs between perf_events subsystem extensions, required level of control
>>> and configurability of perf_events, existing users adoption effort, and it brings
>>> security hardening benefits of decreasing attack surface for the existing users
>>> and use cases.
>>
>> I'm not 100% opposed to CAP_SYS_PERFMON. I am 100% opposed to new capabilities
>> that have a single use. Surely there are other CAP_SYS_ADMIN users that [cs]ould
>> be converted to CAP_SYS_PERFMON as well. If there is a class of system performance
>> privileged operations, say a dozen or so, you may have a viable argument.
>
> perf events is not a single use. It has a bazillion of sub functionalities,
> including hardware tracing, software tracing, pmu counters, software counters,
> uncore counters, break points and various other stuff in its PMU drivers.
>
> See it more as a whole quite heterogenous driver subsystem.
>
> I guess CAP_SYS_PERFMON is not a good name because perf is much more
> than just Perfmon. Perhaps call it CAP_SYS_PERF_EVENTS

That seems misleading since it isn't being checked for all perf_events
operations IIUC (CAP_SYS_ADMIN is still required for some?) and it is even more
specialized than CAP_SYS_PERFMON, making it less likely that we could ever use
this capability as a check for other kernel performance monitoring facilities
beyond perf_events.

I'm not as opposed to fine-grained capabilities as Casey is but I do recognize
that there are a limited number of available bits (although we do have a fair
number of unused ones currently given the extension to 64-bits) and that it
would be easy to consume them all if we allocated one for every kernel feature.
That said, this might be a sufficiently important use case to justify it.

Obviously I'd encourage you to consider leveraging SELinux as well but I
understand that you are looking for a solution that doesn't depend on a distro
using a particular LSM or a particular policy. I will note that SELinux doesn't
suffer from the limited bits problem because one can always define a new
SELinux security class with its own access vector permissions bitmap, as has
been done for the recently added LSM/SELinux perf_event hooks.

I don't know who actually gets to decide when/if a new capability is allocated.
Maybe Serge and/or James as capabilities and LSM maintainers.

I have no objections to these patches from a SELinux POV.
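The remark about the limited number of available bits can be put into numbers
using the CapBnd value quoted earlier in the thread (0000007fffffffff). This is
a rough sketch only: it assumes that the quoted process has a full bounding
set, so that every defined capability has its bit set, and it takes the 64-bit
width from the message above. defined_capability_count and
free_capability_bits are hypothetical helper names.

```python
def defined_capability_count(bounding_set_hex):
    """Count the bits set in a capability bounding set mask given as a hex string."""
    return bin(int(bounding_set_hex, 16)).count("1")


def free_capability_bits(bounding_set_hex, width=64):
    """Bits not yet allocated, assuming a full bounding set and `width` total bits."""
    return width - defined_capability_count(bounding_set_hex)


if __name__ == "__main__":
    # CapBnd from the /proc/<pid>/status dump in the cover letter.
    capbnd = "0000007fffffffff"
    print(defined_capability_count(capbnd), free_capability_bits(capbnd))
```

Under those assumptions the patched kernel defines 39 capabilities (bits 0
through 38), which leaves 25 of the 64 bits unallocated.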
On 12.12.2019 17:24, Stephen Smalley wrote:
> On 12/11/19 3:36 PM, Andi Kleen wrote:
>>>> In this circumstances CAP_SYS_PERFMON looks like smart balanced advancement that
>>>> trade-offs between perf_events subsystem extensions, required level of control
>>>> and configurability of perf_events, existing users adoption effort, and it brings
>>>> security hardening benefits of decreasing attack surface for the existing users
>>>> and use cases.
>>>
>>> I'm not 100% opposed to CAP_SYS_PERFMON. I am 100% opposed to new capabilities
>>> that have a single use. Surely there are other CAP_SYS_ADMIN users that [cs]ould
>>> be converted to CAP_SYS_PERFMON as well. If there is a class of system performance
>>> privileged operations, say a dozen or so, you may have a viable argument.
>>
>> perf events is not a single use. It has a bazillion of sub functionalities,
>> including hardware tracing, software tracing, pmu counters, software counters,
>> uncore counters, break points and various other stuff in its PMU drivers.
>>
>> See it more as a whole quite heterogenous driver subsystem.
>>
>> I guess CAP_SYS_PERFMON is not a good name because perf is much more
>> than just Perfmon. Perhaps call it CAP_SYS_PERF_EVENTS
>
> That seems misleading since it isn't being checked for all perf_events
> operations IIUC (CAP_SYS_ADMIN is still required for some?) and it is even more
> specialized than CAP_SYS_PERFMON, making it less likely that we could ever use
> this capability as a check for other kernel performance monitoring facilities
> beyond perf_events.
>
> I'm not as opposed to fine-grained capabilities as Casey is but I do recognize
> that there are a limited number of available bits (although we do have a fair
> number of unused ones currently given the extension to 64-bits) and that it
> would be easy to consume them all if we allocated one for every kernel feature.
> That said, this might be a sufficiently important use case to justify it.
>
> Obviously I'd encourage you to consider leveraging SELinux as well but I
> understand that you are looking for a solution that doesn't depend on a distro
> using a particular LSM or a particular policy. I will note that SELinux doesn't
> suffer from the limited bits problem because one can always define a new
> SELinux security class with its own access vector permissions bitmap, as has
> been done for the recently added LSM/SELinux perf_event hooks.
>
> I don't know who actually gets to decide when/if a new capability is allocated.
> Maybe Serge and/or James as capabilities and LSM maintainers.
>
> I have no objections to these patches from a SELinux POV.

Stephen, thanks for meaningful input!

~Alexey