[v2,1/4] perf-security: document perf_events/Perf resource control

Message ID a108c800-d6c1-fac1-b73b-e390e579e6ef@linux.intel.com
State New
Series
  • admin-guide: extend perf-security with resource control, data categories and privileged users

Commit Message

Alexey Budankov Feb. 7, 2019, 1:29 p.m. UTC
Extend perf-security.rst file with perf_events/Perf resource control
section describing RLIMIT_NOFILE and perf_event_mlock_kb settings for
performance monitoring user processes.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
Changes in v2:
- applied comments on v1

---
 Documentation/admin-guide/perf-security.rst | 36 +++++++++++++++++++++
 1 file changed, 36 insertions(+)

Comments

Thomas Gleixner Feb. 10, 2019, 10:34 p.m. UTC | #1
On Thu, 7 Feb 2019, Alexey Budankov wrote:

General note: Please stay in the 80 char limit for all of the text.

> +The perf_events system call API [2]_ allocates file descriptors for every configured
> +PMU event. Open file descriptors are a per-process accountable resource governed
> +by the RLIMIT_NOFILE [11]_ limit (ulimit -n), which is usually derived from the login
> +shell process. When configuring Perf collection for a long list of events on a
> +large server system, this limit can be easily hit preventing required monitoring
> +configuration.

I'd move this sentence into a different paragraph and keep those related to
RLIMIT_NOFILE together.

> ... RLIMIT_NOFILE limit can be increased on per-user basis modifying
> +content of the limits.conf file [12]_ on some systems.

On some systems?

> Ordinarily, a Perf sampling session
> +(perf record) requires an amount of open perf_event file descriptors that is not
> +less than a number of monitored events multiplied by a number of monitored CPUs.

  s/a number of/the number of/

The ordinary use case is:

    perf CMD pile-of-events PROCESS

which does not specify the monitored CPUs at all. Then the number of file
descriptors is NR_EVENTS * NR_ONLINE_CPUS.
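
As a rough sketch of that count, assuming one file descriptor per event
per online CPU and nothing else held open by the process. The helper
names are hypothetical, and os.cpu_count() is only an approximation of
NR_ONLINE_CPUS:

```python
import os
import resource


def required_perf_fds(nr_events: int, nr_cpus: int) -> int:
    """Lower bound on perf_event file descriptors: NR_EVENTS * NR_CPUS."""
    if nr_events < 0 or nr_cpus < 0:
        raise ValueError("counts must be non-negative")
    return nr_events * nr_cpus


def fits_soft_limit(nr_events: int, nr_cpus: int, soft_limit: int) -> bool:
    """True if the estimate stays within the given RLIMIT_NOFILE soft limit."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return required_perf_fds(nr_events, nr_cpus) <= soft_limit


def check_current_process(nr_events: int) -> bool:
    """Apply the estimate to this process's own limit and CPU count."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    nr_cpus = os.cpu_count() or 1  # assumption: stands in for online CPUs
    return fits_soft_limit(nr_events, nr_cpus, soft)
```

For instance, required_perf_fds(40, 64) gives 2560, which would not fit
under a soft limit of 1024 (a value chosen here purely as an example).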

> +An amount of memory available to user processes for capturing performance monitoring

The amount ...

> +data is governed by the perf_event_mlock_kb [2]_ setting. This perf_event specific
> +resource setting defines overall per-cpu limits of memory allowed for mapping
> +by the user processes to execute performance monitoring. The setting essentially
> +extends the RLIMIT_MEMLOCK [11]_ limit, but only for memory regions mapped specially

s/specially/specifically/

> +for capturing monitored performance events and related data.
> +
> +For example, if a machine has eight cores and perf_event_mlock_kb limit is set
> +to 516 KiB, then a user process is provided with 516 KiB * 8 = 4128 KiB of memory
> +above the RLIMIT_MEMLOCK limit (ulimit -l) for perf_event mmap buffers. In particular,
> +this means that, if the user wants to start two or more performance monitoring
> +processes, the user is required to manually distribute available 4128 KiB between the

distribute the available

> +monitoring processes, for example, using the --mmap-pages Perf record mode option.
> +Otherwise, the first started performance monitoring process allocates all available
> +4128 KiB and the other processes will fail to proceed due to the lack of memory.
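
The arithmetic in the quoted example can be sketched as below, taking
the patch's accounting model at face value (per-CPU limit multiplied by
the number of CPUs, then divided among processes). The function names
are hypothetical, and the even split is only one possible way to
distribute the budget:

```python
from pathlib import Path

# sysctl file that exposes the perf_event_mlock_kb setting
SYSCTL_PATH = Path("/proc/sys/kernel/perf_event_mlock_kb")


def read_mlock_kb(path: Path = SYSCTL_PATH) -> int:
    """Read the current perf_event_mlock_kb value in KiB."""
    return int(path.read_text().strip())


def total_extra_kib(mlock_kb: int, nr_cpus: int) -> int:
    """Memory above RLIMIT_MEMLOCK under the patch's model: limit * CPUs."""
    return mlock_kb * nr_cpus


def per_process_share_kib(mlock_kb: int, nr_cpus: int, nr_processes: int) -> int:
    """Even split of the total budget among monitoring processes."""
    if nr_processes < 1:
        raise ValueError("need at least one monitoring process")
    return total_extra_kib(mlock_kb, nr_cpus) // nr_processes
```

With the numbers from the example, total_extra_kib(516, 8) is 4128 and
per_process_share_kib(516, 8, 2) is 2064, so each of two perf record
sessions would have to keep its mmap buffers within 2064 KiB.
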
> +
> +RLIMIT_MEMLOCK and perf_event_mlock_kb resource costraints are ignored for

constraints.

> +processes with the CAP_IPC_LOCK capability. Thus, perf_events/Perf privileged users

What does perf_events/Perf mean?

> +can be provided with memory above the constraints for perf_events/Perf performance
> +monitoring purpose by providing the Perf executable with CAP_IPC_LOCK capability.

Thanks,

	tglx
Alexey Budankov Feb. 11, 2019, 12:46 p.m. UTC | #2
On 11.02.2019 1:34, Thomas Gleixner wrote:
> On Thu, 7 Feb 2019, Alexey Budankov wrote:
> 
> General note: Please stay in the 80 char limit for all of the text.

Yes, sure. [PATCH v2 4/4] implements wrapping at 72 columns.

> 
>> +The perf_events system call API [2]_ allocates file descriptors for every configured
>> +PMU event. Open file descriptors are a per-process accountable resource governed
>> +by the RLIMIT_NOFILE [11]_ limit (ulimit -n), which is usually derived from the login
>> +shell process. When configuring Perf collection for a long list of events on a
>> +large server system, this limit can be easily hit preventing required monitoring
>> +configuration.
> 
> I'd move this sentence into a different paragraph and keep those related to
> RLIMIT_NOFILE together.

Makes sense. Let's have these two paragraphs:

Open file descriptors
+++++++++++++++++++++

Memory allocation
+++++++++++++++++

> 
>> ... RLIMIT_NOFILE limit can be increased on per-user basis modifying
>> +content of the limits.conf file [12]_ on some systems.
> 
> On some systems?

Well, let's avoid this subtlety and have it like:

'The RLIMIT_NOFILE limit can be increased on a per-user basis by
 modifying the content of the limits.conf file [12]_.'

> 
>> Ordinarily, a Perf sampling session
>> +(perf record) requires an amount of open perf_event file descriptors that is not
>> +less than a number of monitored events multiplied by a number of monitored CPUs.
> 
>   s/a number of/the number of/

Accepted.

> 
> The ordinary use case is:
> 
>     perf CMD pile-of-events PROCESS
> 
> which does not specify the monitored CPUs at all. Then the number of file
> descriptors is NR_EVENTS * NR_ONLINE_CPUS.
> 
>> +An amount of memory available to user processes for capturing performance monitoring
> 
> The amount ...

Accepted.

> 
>> +data is governed by the perf_event_mlock_kb [2]_ setting. This perf_event specific
>> +resource setting defines overall per-cpu limits of memory allowed for mapping
>> +by the user processes to execute performance monitoring. The setting essentially
>> +extends the RLIMIT_MEMLOCK [11]_ limit, but only for memory regions mapped specially
> 
> s/specially/specifically/

Accepted.

> 
>> +for capturing monitored performance events and related data.
>> +
>> +For example, if a machine has eight cores and perf_event_mlock_kb limit is set
>> +to 516 KiB, then a user process is provided with 516 KiB * 8 = 4128 KiB of memory
>> +above the RLIMIT_MEMLOCK limit (ulimit -l) for perf_event mmap buffers. In particular,
>> +this means that, if the user wants to start two or more performance monitoring
>> +processes, the user is required to manually distribute available 4128 KiB between the
> 
> distribute the available

Accepted.

> 
>> +monitoring processes, for example, using the --mmap-pages Perf record mode option.
>> +Otherwise, the first started performance monitoring process allocates all available
>> +4128 KiB and the other processes will fail to proceed due to the lack of memory.
>> +
>> +RLIMIT_MEMLOCK and perf_event_mlock_kb resource costraints are ignored for
> 
> constraints.

Accepted.

> 
>> +processes with the CAP_IPC_LOCK capability. Thus, perf_events/Perf privileged users
> 
> What does perf_events/Perf mean?

'perf_events/Perf privileged users' refers to the paragraph about privileged users.
'perf_events/Perf' means the exact combination of the kernel subsystem (perf_events)
and the privileged Perf tool executable (Perf) that provides a certain group of users
with performance monitoring capabilities without scope limits.

> 
>> +can be provided with memory above the constraints for perf_events/Perf performance
>> +monitoring purpose by providing the Perf executable with CAP_IPC_LOCK capability.
> 
> Thanks,
> 
> 	tglx
> 

Thanks,
Alexey
Thomas Gleixner Feb. 11, 2019, 2:15 p.m. UTC | #3
On Mon, 11 Feb 2019, Alexey Budankov wrote:
> On 11.02.2019 1:34, Thomas Gleixner wrote:
> > On Thu, 7 Feb 2019, Alexey Budankov wrote:
> > 
> > General note: Please stay in the 80 char limit for all of the text.
> 
> Yes, sure. [PATCH v2 4/4] implements wrapping at 72 columns.

So you provide crappy formatted stuff first, just to reformat it at the
end. I'm missing the logic behind that.

Thanks,

	tglx
Alexey Budankov Feb. 11, 2019, 2:22 p.m. UTC | #4
On 11.02.2019 17:15, Thomas Gleixner wrote:
> On Mon, 11 Feb 2019, Alexey Budankov wrote:
>> On 11.02.2019 1:34, Thomas Gleixner wrote:
>>> On Thu, 7 Feb 2019, Alexey Budankov wrote:
>>>
>>> General note: Please stay in the 80 char limit for all of the text.
>>
>> Yes, sure. [PATCH v2 4/4] implements wrapping at 72 columns.
> 
> So you provide crappy formatted stuff first, just to reformat it at the
> end. I'm missing the logic behind that.

The logic is to avoid mixing the review of new content with the
reformatting of the whole document at the end.

Thanks,
Alexey

> 
> Thanks,
> 
> 	tglx
>

Patch

diff --git a/Documentation/admin-guide/perf-security.rst b/Documentation/admin-guide/perf-security.rst
index f73ebfe9bfe2..3915f07b9dea 100644
--- a/Documentation/admin-guide/perf-security.rst
+++ b/Documentation/admin-guide/perf-security.rst
@@ -84,6 +84,40 @@  governed by perf_event_paranoid [2]_ setting:
      locking limit is imposed but ignored for unprivileged processes with
      CAP_IPC_LOCK capability.
 
+perf_events/Perf resource control
+---------------------------------
+
+The perf_events system call API [2]_ allocates file descriptors for every configured
+PMU event. Open file descriptors are a per-process accountable resource governed
+by the RLIMIT_NOFILE [11]_ limit (ulimit -n), which is usually derived from the login
+shell process. When configuring Perf collection for a long list of events on a
+large server system, this limit can be easily hit preventing required monitoring
+configuration. RLIMIT_NOFILE limit can be increased on per-user basis modifying
+content of the limits.conf file [12]_ on some systems. Ordinarily, a Perf sampling session
+(perf record) requires an amount of open perf_event file descriptors that is not
+less than a number of monitored events multiplied by a number of monitored CPUs.
+
+An amount of memory available to user processes for capturing performance monitoring
+data is governed by the perf_event_mlock_kb [2]_ setting. This perf_event specific
+resource setting defines overall per-cpu limits of memory allowed for mapping
+by the user processes to execute performance monitoring. The setting essentially
+extends the RLIMIT_MEMLOCK [11]_ limit, but only for memory regions mapped specially
+for capturing monitored performance events and related data.
+
+For example, if a machine has eight cores and perf_event_mlock_kb limit is set
+to 516 KiB, then a user process is provided with 516 KiB * 8 = 4128 KiB of memory
+above the RLIMIT_MEMLOCK limit (ulimit -l) for perf_event mmap buffers. In particular,
+this means that, if the user wants to start two or more performance monitoring
+processes, the user is required to manually distribute available 4128 KiB between the
+monitoring processes, for example, using the --mmap-pages Perf record mode option.
+Otherwise, the first started performance monitoring process allocates all available
+4128 KiB and the other processes will fail to proceed due to the lack of memory.
+
+RLIMIT_MEMLOCK and perf_event_mlock_kb resource costraints are ignored for
+processes with the CAP_IPC_LOCK capability. Thus, perf_events/Perf privileged users
+can be provided with memory above the constraints for perf_events/Perf performance
+monitoring purpose by providing the Perf executable with CAP_IPC_LOCK capability.
+
 Bibliography
 ------------
 
@@ -94,4 +128,6 @@  Bibliography
 .. [5] `<https://www.kernel.org/doc/html/latest/security/credentials.html>`_
 .. [6] `<http://man7.org/linux/man-pages/man7/capabilities.7.html>`_
 .. [7] `<http://man7.org/linux/man-pages/man2/ptrace.2.html>`_
+.. [11] `<http://man7.org/linux/man-pages/man2/getrlimit.2.html>`_
+.. [12] `<http://man7.org/linux/man-pages/man5/limits.conf.5.html>`_