Trace Summarization

This guide explains how to use Perfetto's trace summarization feature to extract structured, actionable data from your traces.

Why Use Trace Summarization?

PerfettoSQL is a powerful tool for interactively exploring traces. You can write any query you want, and the results are immediately available. However, this flexibility presents a challenge for automation and large-scale analysis. The output of a SELECT statement has an arbitrary schema (column names and types), which can change from one query to the next. This makes it difficult to build generic tools, dashboards, or regression-detection systems that consume this data, as they cannot rely on a stable data structure.

Trace summarization solves this problem. It provides a way to define a stable, structured schema for the data you want to extract from a trace. Instead of producing arbitrary tables, it generates a consistent protobuf message (TraceSummary) that is easy for tools to parse and process.

This is especially powerful for cross-trace analysis. By running the same summary specification across hundreds or thousands of traces, you can reliably aggregate the results to track performance metrics over time, compare different versions of your application, and automatically detect regressions.

In short, use trace summarization when you need to:

  • Extract data from traces with a stable, tool-friendly schema.
  • Aggregate or compare results across many traces.
  • Feed dashboards, pipelines, or automated regression-detection systems.

Using Summaries with the Standard Library

The easiest way to get started is by using the modules in the PerfettoSQL Standard Library.

Let's walk through an example. Suppose we want to compute the average memory usage (specifically, RSS + Swap) for each process in a trace. The linux.memory.process module already provides a table, memory_rss_and_swap_per_process, that is perfect for this.

We can define a TraceSummarySpec to compute this metric:

// spec.textproto
metric_spec {
  id: "memory_per_process"
  dimensions: "process_name"
  value: "avg_rss_and_swap"
  query: {
    table: {
      table_name: "memory_rss_and_swap_per_process"
      module_name: "linux.memory.process"
    }
    group_by: {
      column_names: "process_name"
      aggregates: {
        column_name: "rss_and_swap"
        op: DURATION_WEIGHTED_MEAN
        result_column_name: "avg_rss_and_swap"
      }
    }
  }
}

To run this, save the above content as spec.textproto and use either the Python API or the command-line shell:

Using the Python API:

from perfetto.trace_processor import TraceProcessor

with open('spec.textproto', 'r') as f:
    spec_text = f.read()

with TraceProcessor(trace='my_trace.pftrace') as tp:
    summary = tp.trace_summary(
        specs=[spec_text],
        metric_ids=["memory_per_process"]
    )
    print(summary)

Using the command-line shell:

trace_processor_shell --summary \
  --summary-spec spec.textproto \
  --summary-metrics-v2 memory_per_process \
  my_trace.pftrace
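Because the output schema is stable, the same spec can be reused across many traces and the results combined. Below is a minimal sketch using the Python API; the trace file names and the downstream aggregation are placeholders:

from perfetto.trace_processor import TraceProcessor

with open('spec.textproto', 'r') as f:
    spec_text = f.read()

# Placeholder trace files, e.g. captured from different builds of an app.
traces = ['build_100.pftrace', 'build_101.pftrace', 'build_102.pftrace']

summaries = []
for trace_file in traces:
    with TraceProcessor(trace=trace_file) as tp:
        # Every run produces a TraceSummary with the same schema, so the
        # collected protos can be aggregated or diffed downstream.
        summaries.append(tp.trace_summary(
            specs=[spec_text],
            metric_ids=["memory_per_process"]
        ))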

Reducing Duplication with Templates

Often, you'll want to compute several related metrics that share the same underlying query and dimensions. For example, for a given process, you might want to know the minimum, maximum, and average memory usage.

Instead of writing a separate metric_spec for each, which would involve repeating the same query and dimensions blocks, you can use a TraceMetricV2TemplateSpec. This is more concise, less error-prone, and more performant as the underlying query is only run once.

Let's extend our memory example to calculate the min, max, and duration-weighted average of RSS+Swap for each process.

// spec.textproto
metric_template_spec {
  id_prefix: "memory_per_process"
  dimensions: "process_name"
  value_columns: "min_rss_and_swap"
  value_columns: "max_rss_and_swap"
  value_columns: "avg_rss_and_swap"
  query: {
    table: {
      table_name: "memory_rss_and_swap_per_process"
      module_name: "linux.memory.process"
    }
    group_by: {
      column_names: "process_name"
      aggregates: {
        column_name: "rss_and_swap"
        op: MIN
        result_column_name: "min_rss_and_swap"
      }
      aggregates: {
        column_name: "rss_and_swap"
        op: MAX
        result_column_name: "max_rss_and_swap"
      }
      aggregates: {
        column_name: "rss_and_swap"
        op: DURATION_WEIGHTED_MEAN
        result_column_name: "avg_rss_and_swap"
      }
    }
  }
}

This single template generates three metrics:

  • memory_per_process_min_rss_and_swap
  • memory_per_process_max_rss_and_swap
  • memory_per_process_avg_rss_and_swap

You can then run this, requesting any or all of the generated metrics, as shown below.

Using the Python API:

from perfetto.trace_processor import TraceProcessor

with open('spec.textproto', 'r') as f:
    spec_text = f.read()

with TraceProcessor(trace='my_trace.pftrace') as tp:
    summary = tp.trace_summary(
        specs=[spec_text],
        metric_ids=[
            "memory_per_process_min_rss_and_swap",
            "memory_per_process_max_rss_and_swap",
            "memory_per_process_avg_rss_and_swap",
        ]
    )
    print(summary)

Using the command-line shell:

trace_processor_shell --summary \
  --summary-spec spec.textproto \
  --summary-metrics-v2 memory_per_process_min_rss_and_swap,memory_per_process_max_rss_and_swap,memory_per_process_avg_rss_and_swap \
  my_trace.pftrace

Using Summaries with Custom SQL Modules

While the standard library is powerful, you will often need to analyze custom events specific to your application. You can achieve this by writing your own SQL modules and loading them into Trace Processor.

A SQL package is simply a directory containing .sql files. This directory can be loaded into Trace Processor, and its files become available as modules.

Let's say you have custom slices named game_frame and you want to calculate the average, minimum, and maximum frame duration.

1. Create your custom SQL module:

Create a directory structure like this:

my_sql_modules/
└── my_game/
    └── metrics.sql

Inside metrics.sql, define a view that calculates the frame stats:

-- my_sql_modules/my_game/metrics.sql
CREATE PERFETTO VIEW game_frame_stats AS
SELECT
  'game_frame' AS frame_type,
  MIN(dur) AS min_duration_ns,
  MAX(dur) AS max_duration_ns,
  AVG(dur) AS avg_duration_ns
FROM slice
WHERE name = 'game_frame'
GROUP BY 1;
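Optionally, you can sanity-check the view interactively before wiring it into a spec. A sketch using the Python API (package loading via TraceProcessorConfig is covered in step 3):

from perfetto.trace_processor import TraceProcessor, TraceProcessorConfig

config = TraceProcessorConfig(add_sql_packages=['./my_sql_modules'])
with TraceProcessor(trace='my_trace.pftrace', config=config) as tp:
    # INCLUDE PERFETTO MODULE makes the view from metrics.sql available.
    for row in tp.query("""
        INCLUDE PERFETTO MODULE my_game.metrics;
        SELECT * FROM game_frame_stats
    """):
        print(row.frame_type, row.avg_duration_ns)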

2. Use a template in your summary spec:

Again, we can use a TraceMetricV2TemplateSpec to generate these related metrics from a single, shared configuration.

Create a spec.textproto that references your custom module and view:

// spec.textproto
metric_template_spec {
  id_prefix: "game_frame"
  dimensions: "frame_type"
  value_columns: "min_duration_ns"
  value_columns: "max_duration_ns"
  value_columns: "avg_duration_ns"
  query: {
    table: {
      // The module name is the directory path relative to the package root,
      // with the .sql extension removed.
      module_name: "my_game.metrics"
      table_name: "game_frame_stats"
    }
  }
}

3. Run the summary with your custom package:

You can now compute the summary using either the Python API or the command-line shell, telling Trace Processor where to find your custom package.

Use the add_sql_packages argument in the TraceProcessorConfig.

from perfetto.trace_processor import TraceProcessor, TraceProcessorConfig

# Path to your custom SQL modules directory.
sql_package_path = './my_sql_modules'

config = TraceProcessorConfig(
    add_sql_packages=[sql_package_path]
)

with open('spec.textproto', 'r') as f:
    spec_text = f.read()

with TraceProcessor(trace='my_trace.pftrace', config=config) as tp:
    # Requesting one, some, or all of the generated metrics.
    summary = tp.trace_summary(
        specs=[spec_text],
        metric_ids=[
            "game_frame_min_duration_ns",
            "game_frame_max_duration_ns",
            "game_frame_avg_duration_ns"
        ]
    )
    print(summary)

Use the --add-sql-package flag. You can list the metrics explicitly or use the all keyword.

trace_processor_shell --summary \
  --add-sql-package ./my_sql_modules \
  --summary-spec spec.textproto \
  --summary-metrics-v2 game_frame_min_duration_ns,game_frame_max_duration_ns,game_frame_avg_duration_ns \
  my_trace.pftrace

Common Patterns and Techniques

Analyzing Time Intervals with interval_intersect

A common pattern is to analyze data from one source (e.g., CPU usage) only within specific time windows defined by another (e.g., a "Critical User Journey" slice). The interval_intersect query makes this easy.

It works by taking a base query and one or more interval queries. The result includes only the rows from the base query that overlap in time with at least one row from each of the interval queries.

Use Cases:

  • Measuring CPU or memory usage only while a particular slice (e.g., a Critical User Journey) is active.
  • Restricting any metric to one or more time windows of interest within the trace.

Example: CPU Time during a Specific CUJ Slice

This example demonstrates using interval_intersect to find the total CPU time of the thread named "bar" within the duration of any "baz_*" slice from the "system_server" process.

// In a metric_spec with id: "bar_cpu_time_during_baz_cujs"
query: {
  interval_intersect: {
    base: {
      // The base data is CPU time per thread.
      table: {
        table_name: "thread_slice_cpu_time"
        module_name: "slices.cpu_time"
      }
      filters: {
        column_name: "thread_name"
        op: EQUAL
        string_rhs: "bar"
      }
    }
    interval_intersect: {
      // The intervals are the "baz_*" slices.
      simple_slices: {
        slice_name_glob: "baz_*"
        process_name_glob: "system_server"
      }
    }
  }
  group_by: {
    // We sum the CPU time from the intersected intervals.
    aggregates: {
      column_name: "cpu_time"
      op: SUM
      result_column_name: "total_cpu_time"
    }
  }
}
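This metric can then be computed like any other, for example with the shell:

trace_processor_shell --summary \
  --summary-spec spec.textproto \
  --summary-metrics-v2 bar_cpu_time_during_baz_cujs \
  my_trace.pftrace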

Adding Trace-Wide Metadata

You can add key-value metadata to your summary to provide context for the metrics, such as the device model or OS version. This is especially useful when analyzing multiple traces, as it allows you to group or filter results based on this metadata.

The metadata is computed alongside any metrics you request in the same run.

1. Define the metadata query in your spec:

This query must return "key" and "value" columns.

// In spec.textproto, alongside your metric_spec definitions
query {
  id: "device_info_query"
  sql {
    sql: "SELECT 'device_name' AS key, 'Pixel Test' AS value"
    column_names: "key"
    column_names: "value"
  }
}

2. Run the summary with both metrics and metadata:

When you run the summary, you specify both the metrics you want to compute and the query to use for metadata.

Pass both metric_ids and metadata_query_id:

summary = tp.trace_summary(
    specs=[spec_text],
    metric_ids=["game_frame_avg_duration_ns"],
    metadata_query_id="device_info_query"
)

Use both --summary-metrics-v2 and --summary-metadata-query:

trace_processor_shell --summary \
  --summary-spec spec.textproto \
  --summary-metrics-v2 game_frame_avg_duration_ns \
  --summary-metadata-query device_info_query \
  my_trace.pftrace

Output Format

The result of a summary is a TraceSummary protobuf message. This message contains a metric_bundles field, which is a list of TraceMetricV2Bundle messages.

Each bundle can contain the results for one or more metrics that were computed together. Using a TraceMetricV2TemplateSpec is the most common way to create a bundle. All metrics generated from a single template are automatically placed in the same bundle, sharing the same specs and row structure. This is highly efficient as the dimension values, which are often repetitive, are only written once per row.

Example Output

For the memory_per_process template example, the output TraceSummary would contain a TraceMetricV2Bundle like this:

# In TraceSummary's metric_bundles field:
metric_bundles {
  # The specs for all three metrics generated by the template.
  specs {
    id: "memory_per_process_min_rss_and_swap"
    dimensions: "process_name"
    value: "min_rss_and_swap"
    # ... query details ...
  }
  specs {
    id: "memory_per_process_max_rss_and_swap"
    dimensions: "process_name"
    value: "max_rss_and_swap"
    # ... query details ...
  }
  specs {
    id: "memory_per_process_avg_rss_and_swap"
    dimensions: "process_name"
    value: "avg_rss_and_swap"
    # ... query details ...
  }
  # Each row contains one set of dimensions and three values, corresponding
  # to the three metrics in `specs`.
  row {
    values { double_value: 100000 }     # min
    values { double_value: 200000 }     # max
    values { double_value: 123456.789 } # avg
    dimension { string_value: "com.example.app" }
  }
  row {
    values { double_value: 80000 }      # min
    values { double_value: 150000 }     # max
    values { double_value: 98765.432 }  # avg
    dimension { string_value: "system_server" }
  }
  # ...
}
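To consume this programmatically, you can walk the repeated fields of the returned proto. A minimal sketch, assuming `summary` is the result of the Python trace_summary call above and that the bindings expose the fields exactly as named in the textproto:

# Walk the bundles, rows, and values of a TraceSummary proto.
for bundle in summary.metric_bundles:
    # specs[i] describes the metric whose value is row.values[i].
    metric_ids = [spec.id for spec in bundle.specs]
    for row in bundle.row:
        process_name = row.dimension[0].string_value
        for metric_id, value in zip(metric_ids, row.values):
            print(process_name, metric_id, value.double_value)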

Comparison with the Legacy Metrics System

Perfetto previously had a different system for computing metrics, often referred to as "v1 metrics." Trace summarization is the successor to this system, designed to be more robust and easier to use.

Here are the key differences:

  • Output schema: each v1 metric defined its own output protobuf message, so every metric produced a different schema; trace summarization always emits the same TraceSummary message with a fixed structure.
  • Query definition: v1 metrics were written as free-form SQL paired with per-metric proto definitions; summaries are declarative specs built from structured queries and standard library modules.
  • Cross-trace analysis: because the output schema is stable, summaries from many traces can be aggregated directly, which was difficult with per-metric v1 schemas.

Reference

Running Summaries

You can compute summaries using different Perfetto tools.

For programmatic workflows, use the trace_summary method of the TraceProcessor class.

from perfetto.trace_processor import TraceProcessor

# Assume 'tp' is an initialized TraceProcessor instance
# and 'spec_text' contains your TraceSummarySpec.
summary_proto = tp.trace_summary(
    specs=[spec_text],
    metric_ids=["example_metric"],
    metadata_query_id="device_info_query"
)
print(summary_proto)

The trace_summary method takes the following arguments:

  • specs: A list of TraceSummarySpec definitions (as text or bytes).
  • metric_ids: An optional list of metric IDs to compute. If None, all metrics in the specs are computed (see the example after this list).
  • metadata_query_id: An optional ID of a query to run for trace-wide metadata.
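For example, to compute every metric defined in the spec, simply omit metric_ids (reusing tp and spec_text from above):

# metric_ids defaults to None, which computes all metrics in the specs.
summary_proto = tp.trace_summary(specs=[spec_text])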

The trace_processor_shell allows you to compute trace summaries from a trace file using dedicated flags.

  • Run specific metrics by ID: Provide a comma-separated list of metric IDs using the --summary-metrics-v2 flag.

    trace_processor_shell --summary \
      --summary-spec YOUR_SPEC_FILE \
      --summary-metrics-v2 METRIC_ID_1,METRIC_ID_2 \
      TRACE_FILE

  • Run all metrics defined in the spec: Use the keyword all.

    trace_processor_shell --summary \
      --summary-spec YOUR_SPEC_FILE \
      --summary-metrics-v2 all \
      TRACE_FILE
  • Output Format: Control the output format with --summary-format.
    • text: Human-readable text protobuf (default).
    • binary: Binary protobuf.
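For example, to write the summary as a binary proto to a file, redirect the shell's output:

trace_processor_shell --summary \
  --summary-spec YOUR_SPEC_FILE \
  --summary-metrics-v2 all \
  --summary-format binary \
  TRACE_FILE > summary.binarypb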

TraceSummarySpec

The top-level message for configuring a summary. It contains:

  • metric_spec: a repeated field of TraceMetricV2Spec messages, each defining a single metric.
  • metric_template_spec: a repeated field of TraceMetricV2TemplateSpec messages, each generating several related metrics.
  • query: a repeated field of shared PerfettoSqlStructuredQuery definitions that can be referenced by id (e.g., the metadata query above).

TraceSummary

The top-level message for the output of a summary. It contains:

  • metric_bundles: a list of TraceMetricV2Bundle messages holding the computed metric values.
  • metadata: the key-value pairs produced by the metadata query, if one was requested.

TraceMetricV2Spec

Defines a single metric. Its key fields are:

  • id: a unique identifier for the metric.
  • dimensions: the columns used to break the metric down (e.g., process_name).
  • value: the column holding the metric's value.
  • query: the PerfettoSqlStructuredQuery that produces the data.

TraceMetricV2TemplateSpec

Defines a template for generating multiple, related metrics from a single, shared configuration. This is useful for reducing duplication when several metrics share the same query and dimensions. Each generated metric's id is the template's id_prefix joined to one of its value_columns (e.g., memory_per_process_min_rss_and_swap).

Using a template automatically bundles the generated metrics into a single TraceMetricV2Bundle in the output.

TraceMetricV2Bundle

Contains the results for one or more metrics that were computed together. All metrics generated from one template land in the same bundle, so shared dimension values are written only once per row.

PerfettoSqlStructuredQuery

The PerfettoSqlStructuredQuery message provides a structured way to define PerfettoSQL queries. It is built by defining a data source and then optionally applying filters, group_by operations, and select_columns transformations.

Query Sources

A query's source can be one of the following:

  • table: a table or view, optionally from a PerfettoSQL module (as in the memory examples above).
  • sql: an arbitrary SQL statement with explicitly declared column_names (as in the metadata query).
  • simple_slices: slices matched by name/process/thread glob patterns.
  • interval_intersect: the rows of a base query restricted to the time intervals of one or more other queries.

Query Operations

These operations are applied sequentially to the data from the source:

  • filters: keep only rows matching the given conditions (e.g., column_name, op, string_rhs).
  • group_by: group rows by one or more columns and compute aggregates (e.g., MIN, MAX, SUM, DURATION_WEIGHTED_MEAN).
  • select_columns: select and optionally rename the final output columns.
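Putting these pieces together, here is an illustrative sketch of a structured query that uses simple_slices as its source and aggregates the matched slices' durations. The glob patterns are placeholders, and it is assumed that the source exposes `name` and `dur` columns:

query: {
  // Source: slices matched by glob (placeholder patterns).
  simple_slices: {
    slice_name_glob: "my_slice_*"
    process_name_glob: "com.example.*"
  }
  // Operation: sum the durations of the matched slices per slice name.
  // (Assumes the source exposes `name` and `dur` columns.)
  group_by: {
    column_names: "name"
    aggregates: {
      column_name: "dur"
      op: SUM
      result_column_name: "total_dur"
    }
  }
}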