Multi-machine recording

This document describes how to record a single Perfetto trace that captures events from two Linux machines simultaneously. It uses traced_relay on the second machine to forward producer IPC to a traced running on the first machine.

For background on what multi-machine tracing is and how it works under the hood, see Multi-machine architecture.

Use case

You have a workload split across two Linux machines — e.g. a client on machine A driving a server on machine B, or a host running a Linux VM — and you want a single trace covering both, so cross-machine causality is visible in one timeline and queryable in one trace file.

In the rest of this guide, host is the machine that runs traced and owns the trace buffers, and guest is the second machine, whose producers feed into the same trace via traced_relay. Replace <host-ip> with the IP address (or hostname) of host as reachable from guest.

Prerequisites

NOTE: This guide records ftrace events for the example, which on Linux typically requires running the producer commands as root (or with CAP_SYS_ADMIN). The IPC commands themselves do not require root.

Usage

Step 1: Start traced on the host, listening on TCP

On host:

PERFETTO_PRODUCER_SOCK_NAME=0.0.0.0:20001 \
  tracebox traced --enable-relay-endpoint

PERFETTO_PRODUCER_SOCK_NAME rebinds the producer socket from the default UNIX path to a TCP listener that remote machines can reach. --enable-relay-endpoint makes that socket accept traced_relay connections in addition to ordinary local producers.

Leave this process running.

Step 2: Start traced_probes on the host

In a second shell on host:

PERFETTO_PRODUCER_SOCK_NAME=127.0.0.1:20001 \
  sudo -E tracebox traced_probes

The same env var that rebound traced's listener also tells local producers where to connect — without it, traced_probes would still try the default UNIX socket and fail. sudo -E preserves the env var across the privilege escalation needed for ftrace.

Step 3: Start traced_relay on the guest

On guest:

PERFETTO_RELAY_SOCK_NAME=<host-ip>:20001 \
  tracebox traced_relay

traced_relay opens the standard local producer socket on guest and forwards every producer IPC frame to the host's relay endpoint. You should see a startup line of the form:

Started traced_relay, listening on /tmp/perfetto-producer, forwarding to <host-ip>:20001

(The listening path may instead be /run/perfetto/traced-producer.sock if that directory exists — both are valid Linux defaults.)

Leave this process running.
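
If the startup line does not appear, or traced_relay reports that it cannot connect, one quick check is whether the host's TCP port is reachable from guest at all. The following is a minimal sketch and not part of Perfetto: check_relay_endpoint is a hypothetical helper name, and it assumes bash and the coreutils timeout command are installed on guest. It only tests TCP reachability; it does not confirm that the listener is traced started with --enable-relay-endpoint.

```shell
# Hypothetical helper (not a Perfetto tool): succeeds if a TCP connection
# to HOST:PORT can be opened within 3 seconds.
# Assumes bash (for /dev/tcp) and coreutils `timeout` are available.
check_relay_endpoint() {
  if [ "$#" -ne 2 ]; then
    echo "usage: check_relay_endpoint HOST PORT" >&2
    return 2
  fi
  # bash treats /dev/tcp/HOST/PORT as "open a TCP connection"; the
  # redirection fails if the connection is refused or times out.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}
```

On guest, calling check_relay_endpoint with the host address and port 20001 returns zero when the port accepts connections. A non-zero result usually points at a firewall, a wrong address, or traced not listening on a TCP socket.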

Step 4: Start traced_probes on the guest

In a second shell on guest:

sudo tracebox traced_probes

No env var is needed: with PERFETTO_PRODUCER_SOCK_NAME unset, traced_probes connects to the default Linux producer socket — which is exactly the path traced_relay is listening on — so the two find each other automatically.
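
If traced_probes on guest fails to connect, it may help to check which of the two default paths mentioned in Step 3 actually exists as a socket. This is a minimal sketch: find_producer_socket is a hypothetical helper name, not a Perfetto command, and the two candidate paths are simply the ones listed in this guide.

```shell
# Hypothetical helper: print the first argument that exists and is a
# UNIX socket. With no arguments it checks the two Linux default paths
# named in this guide.
find_producer_socket() {
  if [ "$#" -eq 0 ]; then
    set -- /run/perfetto/traced-producer.sock /tmp/perfetto-producer
  fi
  for candidate in "$@"; do
    if [ -S "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}
```

Running find_producer_socket with no arguments on guest prints the path traced_relay is listening on, or returns non-zero if neither socket exists, which suggests traced_relay is not running.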

Step 5: Record a trace from the host

Multi-machine tracing requires an explicit TraceConfig — the tracebox perfetto -t 10s ... sched/sched_switch shorthand records on the host machine only (see Multi-machine architecture).

On host, write a config file:

cat > config.pbtx <<'EOF'
buffers {
  size_kb: 32768
  fill_policy: RING_BUFFER
}
trace_all_machines: true
data_sources {
  config {
    name: "linux.ftrace"
    ftrace_config {
      ftrace_events: "sched/sched_switch"
    }
  }
}
duration_ms: 10000
EOF

Then record:

tracebox perfetto --txt -c config.pbtx -o trace.pftrace
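
When the duration or buffer size needs to vary between runs, the config can be generated rather than edited by hand. The sketch below is one way to do that under stated assumptions: write_multi_machine_config is a hypothetical helper name, and it emits exactly the fields used in the config above, with only the duration and buffer size parameterized.

```shell
# Hypothetical helper: write a multi-machine ftrace config to FILE.
# Usage: write_multi_machine_config FILE [DURATION_MS] [BUFFER_KB]
# DURATION_MS defaults to 10000 and BUFFER_KB to 32768, matching the
# values used in this guide.
write_multi_machine_config() {
  out=$1
  duration_ms=${2:-10000}
  buffer_kb=${3:-32768}
  # Unquoted EOF so the shell expands the two variables below.
  cat > "$out" <<EOF
buffers {
  size_kb: ${buffer_kb}
  fill_policy: RING_BUFFER
}
trace_all_machines: true
data_sources {
  config {
    name: "linux.ftrace"
    ftrace_config {
      ftrace_events: "sched/sched_switch"
    }
  }
}
duration_ms: ${duration_ms}
EOF
}
```

For example, write_multi_machine_config config.pbtx 30000 produces a 30-second variant that can be passed to the same tracebox perfetto command.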

Step 6: Verify both machines are in the trace

Open trace.pftrace at https://ui.perfetto.dev. In the SQL query view, run:

SELECT id, raw_id, sysname, release, arch, num_cpus FROM machine;

Expect two rows. id = 0 is always the host; remote machines have a non-zero raw_id. See the machine table reference for the full set of columns.

To confirm that events from both machines made it into the trace, group ftrace events by machine. ftrace_event does not carry machine_id directly — each row references a cpu (via ucpu), and cpu carries the machine_id:

SELECT cpu.machine_id, COUNT(*) AS num_events FROM ftrace_event JOIN cpu USING (ucpu) GROUP BY cpu.machine_id;

You should see one row per machine, each with a non-zero count. The same join pattern works against the thread or process tables to slice by machine through different dimensions.
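
The same check can also be scripted instead of run in the UI. The sketch below only writes the per-machine count query from this step to a file; write_machine_count_query is a hypothetical helper name. Running the file is left as a commented example because it rests on an assumption: that a trace_processor_shell binary is on PATH and accepts -q followed by a query file and then the trace path.

```shell
# Hypothetical helper: write the per-machine ftrace event count query
# from this guide to FILE so it can be run outside the UI.
write_machine_count_query() {
  cat > "$1" <<'EOF'
SELECT cpu.machine_id, COUNT(*) AS num_events
FROM ftrace_event
JOIN cpu USING (ucpu)
GROUP BY cpu.machine_id;
EOF
}

# Assumption: trace_processor_shell is installed and supports
# `-q <query-file> <trace>`. Uncomment to run against the recorded trace.
# write_machine_count_query machines.sql
# trace_processor_shell -q machines.sql trace.pftrace
```

A result with fewer rows than machines suggests that one machine's traced_probes never connected to its producer socket.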

Troubleshooting

Next steps