Understanding Linux Performance with Perf (with examples)

Linux offers a range of tools and utilities for performance analysis. One such tool is perf, a framework for performance counter measurements in Linux. It reports detailed information about system and application performance, helping developers and system administrators identify bottlenecks and optimize their applications.

In this article, we will explore several use cases of the perf command, with code examples, to demonstrate its capabilities. We will cover displaying basic performance counter statistics, profiling the whole system in real time, recording profiles of commands and of existing processes, and reading and displaying the recorded profiles.

Use Case 1: Displaying Basic Performance Counter Stats

The perf stat command allows us to display basic performance counter statistics for a specific command. It gives insights into various aspects of program execution, such as CPU cycles, cache misses, branch instructions, etc.

Code Example:

perf stat gcc hello.c 

Motivation:

Displaying basic performance counter stats can help developers understand the runtime characteristics of their code. It provides valuable information about the efficiency of the program, including potential areas where optimizations can be implemented.

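Explanation:

perf stat runs the command that follows it, here gcc hello.c, and prints the counter totals once that command exits.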

Example Output:

 Performance counter stats for 'gcc hello.c':

     22,189.40 msec task-clock        #    1.000 CPUs utilized
         1,673      context-switches  #    0.075 K/sec
            24      cpu-migrations    #    0.001 K/sec
           244      page-faults       #    0.011 K/sec
60,071,519,456      cycles            #    2.709 GHz

   1.436846016 seconds time elapsed
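When these numbers need to be logged or compared between runs, it can help to turn the report into plain name=value pairs. The following is a minimal sketch, not part of perf itself: it runs a small awk filter over sample text copied from the example output, and the variable names stat_output and counters are made up for illustration.

```shell
#!/bin/sh
# Sample text copied from the example output. For a live run, perf stat
# writes its report to stderr, so something like
#   perf stat gcc hello.c 2>&1 | awk ...
# would feed the same filter.
stat_output='
     22,189.40 msec task-clock        #    1.000 CPUs utilized
         1,673      context-switches  #    0.075 K/sec
            24      cpu-migrations    #    0.001 K/sec
           244      page-faults       #    0.011 K/sec
60,071,519,456      cycles            #    2.709 GHz

   1.436846016 seconds time elapsed
'

counters=$(printf '%s\n' "$stat_output" | awk '
    /time elapsed/ { next }              # wall-clock line is not a counter
    $1 ~ /^[0-9][0-9,.]*$/ {
        value = $1
        gsub(/,/, "", value)             # drop thousands separators
        name = ($2 == "msec") ? $3 : $2  # task-clock has an extra unit column
        print name "=" value
    }')

printf '%s\n' "$counters"
```

The filter assumes the column layout shown in the sample; other perf versions or locales may format the report differently, so treat it as a starting point.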

Use Case 2: System-Wide Real-Time Performance Counter Profile

The perf top command is used to display a system-wide real-time performance counter profile. It provides a dynamic view of the most CPU-consuming functions/system calls and their respective event counts. This helps identify high-impact areas in the system and optimize them accordingly.

Code Example:

sudo perf top 

Motivation:

Monitoring the system-wide performance counters in real-time helps in identifying the most resource-intensive components of the system. It provides a quick overview of the system’s behavior and areas that may require optimization.

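Explanation:

perf top samples all CPUs and keeps refreshing its display until you quit with q. sudo is usually needed because system-wide profiling requires elevated privileges on most distributions.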

Example Output:

 . output truncated .
 2.07%  httpd   [kernel]
 1.68%  bash    [kernel]
 1.55%  mysqld  [kernel]
 1.34%  perf    [kernel]
 1.23%  sshd    [kernel]
 . output truncated .

Use Case 3: Recording Command Profiles

The perf record command runs a command and records its performance profile into a file, named perf.data by default. This profile can later be analyzed to understand the program's behavior, including CPU usage, the function call graph, and event counts.

Code Example:

sudo perf record command 

Motivation:

Recording command profiles can help analyze the performance and behavior of specific commands or applications. It allows developers to collect detailed information about the program’s execution, which can be analyzed later to identify performance bottlenecks and optimize the code.

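Explanation:

command is a placeholder for the program to profile, together with its arguments. perf record samples the program while it runs and writes perf.data to the current directory.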

Example Output:

The output of the perf record command is minimal. It reports how many times perf woke up to write data, followed by the size, name, and sample count of the generated perf.data file.

[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (3612 samples) ]
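The sample count in that summary is a quick sanity check: a profile with very few samples says little about where time is spent. As a rough sketch, the count can be pulled out of the summary line with sed. The line below is copied from the example output, and the 1000-sample threshold is an arbitrary value chosen for illustration, not a perf recommendation.

```shell
#!/bin/sh
# Summary line copied from the example output.
summary='[ perf record: Captured and wrote 0.001 MB perf.data (3612 samples) ]'

# Capture the digits that sit between "(" and " samples)".
samples=$(printf '%s\n' "$summary" |
    sed -n 's/.*(\([0-9][0-9]*\) samples).*/\1/p')

min_samples=1000   # hypothetical threshold, tune for your workload

if [ "${samples:-0}" -ge "$min_samples" ]; then
    verdict="enough samples"
else
    verdict="too few samples, consider a longer run"
fi

echo "samples=$samples: $verdict"
```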

Use Case 4: Recording Process Profiles

The perf record command can also be used to record the performance profile of an existing process. This allows us to analyze the behavior and performance characteristics of a running process without having to restart it.

Code Example:

sudo perf record -p pid 

Motivation:

Recording process profiles is useful when analyzing the performance of a long-running process or debugging a specific issue in real-time. It allows us to record and inspect the performance profile without interrupting the process.

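Explanation:

pid is a placeholder for the ID of the process to attach to. Recording continues until you stop perf, for example with Ctrl+C.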

Example Output:

As when recording a command, the output is minimal: how many times perf woke up to write data, followed by the size, name, and sample count of the generated perf.data file.

[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (3612 samples) ]
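A common way to bound how long perf stays attached is to give perf record a short sleep command to run: sampling of the target process stops when sleep exits. The sketch below wraps that idiom in a helper named attach_cmd, a made-up name for this example. It only prints the command after checking that the PID exists, so the invocation can be reviewed before it is run with sudo. The -g option, which asks perf to record call graphs, is an addition not used in the example above.

```shell
#!/bin/sh
# Hypothetical helper: print the perf command for sampling a live process
# for a fixed number of seconds, after checking that the PID exists.
attach_cmd() {
    pid=$1
    seconds=${2:-10}

    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no such process: $pid" >&2
        return 1
    fi

    # "sleep N" bounds the recording: perf stops sampling when sleep exits.
    echo "perf record -g -p $pid -- sleep $seconds"
}

# Demonstrate with this shell's own PID, which is guaranteed to exist.
attach_cmd "$$" 5
```

Note that kill -0 also fails for processes owned by another user unless the check runs with sufficient privileges, so the helper is best run as the same user who will run perf.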

Use Case 5: Reading and Displaying Profiles

Once a performance profile has been recorded with perf record, the perf report command reads it and displays the profile information in an organized manner. It provides insight into function call stacks, event counts, and other details needed to analyze the program's performance.

Code Example:

sudo perf report 

Motivation:

Analyzing the recorded profiles is crucial for identifying performance bottlenecks and optimizing the code. The perf report command provides an easy-to-use interface to explore the recorded information, including the function call graph, event counts, and other important metrics.

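Explanation:

perf report reads perf.data from the current directory by default, so run it in the directory where the profile was recorded.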

Example Output:

The perf report command displays the recorded profile in an interactive interface. It includes information about the function call graph, event counts, and various other statistics for analyzing the program’s performance.

# Overhead  Command  Shared Object
# . . .
#
    92.28%  perf     [kernel]
     0.42%  lsmod    [kernel]
     0.32%  cpup     [kernel]
     0.29%  insmod   [kernel]
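For scripting, perf report can also print plain text with its --stdio option instead of opening the interactive view. The sketch below shows one way such text might be filtered: it keeps only entries at or above an overhead threshold. The input is sample text laid out like the example output, and the variable names and the 0.40 threshold are assumptions made for illustration.

```shell
#!/bin/sh
# Sample text laid out like the example output. For a live profile,
#   sudo perf report --stdio | awk ...
# would feed the same filter.
report='# Overhead  Command  Shared Object
#
    92.28%  perf     [kernel]
     0.42%  lsmod    [kernel]
     0.32%  cpup     [kernel]
     0.29%  insmod   [kernel]'

threshold=0.40   # hypothetical cut-off, in percent

hot=$(printf '%s\n' "$report" | awk -v min="$threshold" '
    /^#/ || NF == 0 { next }        # skip header and blank lines
    {
        pct = $1
        sub(/%$/, "", pct)          # strip the trailing percent sign
        if (pct + 0 >= min + 0) print $2, pct
    }')

printf '%s\n' "$hot"
```

With the sample above, only the perf and lsmod entries pass the threshold.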

Conclusion

The perf command provides powerful tools for analyzing and understanding the performance of Linux systems. Using perf stat, perf top, perf record, and perf report, developers can gain valuable insight into their code's behavior, identify performance bottlenecks, and optimize their applications.