Data Explorer Architecture

This document explains how Perfetto's Data Explorer works, from creating visual query graphs to executing SQL queries and displaying results. It covers the key components, data flow, and architectural patterns that enable the Data Explorer to provide an interactive, node-based SQL query builder for trace analysis.

Overview

The Data Explorer is a visual query builder that allows users to construct complex SQL queries by connecting nodes in a directed acyclic graph (DAG). Each node represents either a data source (table, slices, custom SQL) or an operation (filter, aggregation, join, etc.). The system converts this visual graph into structured SQL queries, executes them via the trace processor, and displays results in an interactive data grid.

Core Data Flow

User Interaction → Node Graph → Structured Query Generation → Query Analysis (Validation) → Query Materialization → Result Display

Node Graph Structure

QueryNode (ui/src/plugins/dev.perfetto.DataExplorer/query_node.ts:128-161)

Node Connections (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/graph_utils.ts)

Node Registration and Creation

NodeRegistry (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/node_registry.ts)

Core Nodes (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/core_nodes.ts)

    registerCoreNodes() {
      nodeRegistry.register('table', {...});
      nodeRegistry.register('slice', {...});
      nodeRegistry.register('sql', {...});
      nodeRegistry.register('filter', {...});
      nodeRegistry.register('aggregation', {...});
      // ... more nodes
    }
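The registration pattern above can be sketched as a small map-backed registry. This is a minimal illustration under stated assumptions, not the actual NodeRegistry: the `SimpleNodeRegistry` class, the `NodeDescriptor` shape, and the `list()` helper are hypothetical names chosen for the example.

```typescript
// Hypothetical, simplified descriptor; the real descriptor likely has more fields.
interface NodeDescriptor {
  name: string;
  // Builds a node instance from an optional initial state.
  factory: (initialState?: unknown) => {type: string; state?: unknown};
}

// Hypothetical registry keyed by node type id.
class SimpleNodeRegistry {
  private readonly descriptors = new Map<string, NodeDescriptor>();

  register(id: string, descriptor: NodeDescriptor): void {
    if (this.descriptors.has(id)) {
      throw new Error(`Node type already registered: ${id}`);
    }
    this.descriptors.set(id, descriptor);
  }

  get(id: string): NodeDescriptor | undefined {
    return this.descriptors.get(id);
  }

  // Ids in registration order (Map preserves insertion order).
  list(): string[] {
    return Array.from(this.descriptors.keys());
  }
}

const registry = new SimpleNodeRegistry();
registry.register('table', {
  name: 'Table',
  factory: (s) => ({type: 'table', state: s}),
});
registry.register('filter', {
  name: 'Filter',
  factory: (s) => ({type: 'filter', state: s}),
});
```

Keeping creation behind a descriptor's `factory` means callers only need the type id to build a node, which is the property the creation code later in this document relies on.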

Node Types

1. Source Nodes (Data Origin)

  TableSourceNode - Queries a specific SQL table
  SlicesSourceNode - Pre-configured query for trace slices
  SqlSourceNode - Custom SQL query as data source
  TimeRangeSourceNode - Generates time intervals

2. Single-Input Modification Nodes

  FilterNode - Adds WHERE conditions
  SortNode - Adds ORDER BY clauses
  AggregationNode - GROUP BY with aggregate functions
  ModifyColumnsNode - Renames/removes columns
  AddColumnsNode - Adds columns from secondary source via LEFT JOIN and/or computed expressions
  LimitAndOffsetNode - Pagination

3. Multi-Input Nodes

  UnionNode - Combines rows from multiple sources
  JoinNode - Combines columns via JOIN conditions
  IntervalIntersectNode - Finds overlapping time intervals
  FilterDuringNode - Filters using secondary interval input
  CreateSlicesNode - Pairs start/end events from two secondary sources into slices

UI Components

Builder (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/builder.ts)

Graph (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/graph/graph.ts)

NodePanel (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/node_panel.ts)

DataExplorer (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/data_explorer.ts)

Query Execution Model

Two-Phase Execution

Phase 1: Analysis (Validation)

Node Graph → Structured Query Protobuf → Engine.updateSummarizerSpec() + querySummarizer() → Query {sql, textproto, columns} | Error

Phase 2: Materialization (Execution)

engine.querySummarizer(summarizerId, nodeId) → TP creates/reuses table → {tableName, rowCount, columns, durationMs} → SQLDataSource → DataGrid Display

QueryExecutionService

Purpose (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/query_execution_service.ts)

Trace Processor as Single Source of Truth

All materialization state is managed by Trace Processor (TP), not the UI:

The UI queries TP on-demand instead of caching:

    // Fetch table name from TP when needed (e.g., for "Copy Table Name" or export)
    async getTableName(nodeId: string): Promise<string | undefined> {
      const result = await engine.querySummarizer(DATA_EXPLORER_SUMMARIZER_ID, nodeId);
      if (result.exists !== true || result.error) {
        return undefined;
      }
      return result.tableName;
    }

This eliminates state synchronization bugs between UI and TP.

FIFO Execution Queue

Rapid Node Click Handling (ui/src/base/async_limiter.ts)

The AsyncLimiter ensures that only the latest queued task runs when the user clicks nodes rapidly:

    // AsyncLimiter behavior:
    while ((task = taskQueue.shift())) {
      if (taskQueue.length > 0) {
        task.deferred.resolve(); // Skip - newer tasks waiting
      } else {
        await task.work(); // Run - this is the latest
      }
    }

Example: Click A → B → C rapidly while A is processing:

  1. A starts processing
  2. B queued, C queued
  3. A finishes
  4. B skipped (queue has C), C runs

This ensures that the currently selected node (C) is processed and that intermediate clicks (B) are skipped.
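The A → B → C scenario can be reproduced with a small self-contained limiter. This is a simplified sketch and not the code in async_limiter.ts: the `LatestOnlyLimiter` class, its `schedule()` method, and the `demo()` helper are names assumed for illustration.

```typescript
type Work = () => Promise<void>;

interface QueuedTask {
  work: Work;
  resolve: () => void;
  reject: (reason: unknown) => void;
}

// Hypothetical limiter: runs one task at a time and skips every queued task
// that has a newer task waiting behind it.
class LatestOnlyLimiter {
  private readonly queue: QueuedTask[] = [];
  private running = false;

  schedule(work: Work): Promise<void> {
    const done = new Promise<void>((resolve, reject) => {
      // The executor runs synchronously, so the task is queued immediately.
      this.queue.push({work, resolve: () => resolve(), reject});
    });
    if (!this.running) {
      this.running = true;
      void this.drain();
    }
    return done;
  }

  private async drain(): Promise<void> {
    for (;;) {
      const task = this.queue.shift();
      if (task === undefined) break;
      if (this.queue.length > 0) {
        task.resolve(); // Skipped: a newer task is waiting.
        continue;
      }
      try {
        await task.work(); // Latest task: actually run it.
        task.resolve();
      } catch (e) {
        task.reject(e);
      }
    }
    this.running = false;
  }
}

// Simulates three rapid clicks; each "analysis" yields once before finishing.
async function demo(): Promise<string[]> {
  const ran: string[] = [];
  const limiter = new LatestOnlyLimiter();
  const click = (nodeName: string): Promise<void> =>
    limiter.schedule(async () => {
      await Promise.resolve(); // Stand-in for asynchronous work.
      ran.push(nodeName);
    });
  await Promise.all([click('A'), click('B'), click('C')]);
  return ran; // ['A', 'C']: B was skipped because C was queued behind it.
}
```

Skipped tasks still resolve their promise, so callers awaiting `schedule()` are never left hanging even when their work did not run.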

Materialization via TP API

    // Sync all queries with TP, then fetch result for the target node
    async processNode(node: QueryNode): Promise<void> {
      // 1. Ensure summarizer exists (created once per session)
      await engine.createSummarizer(DATA_EXPLORER_SUMMARIZER_ID);

      // 2. Register all queries with TP (handles change detection)
      const spec = buildTraceSummarySpec(allNodes);
      await engine.updateSummarizerSpec(DATA_EXPLORER_SUMMARIZER_ID, spec);

      // 3. Fetch result - triggers lazy materialization
      const result = await engine.querySummarizer(DATA_EXPLORER_SUMMARIZER_ID, node.nodeId);
      // Returns: tableName, rowCount, columns, durationMs, sql, textproto
    }

Auto-Execute Logic (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/query_execution_service.ts)

  autoExecute  manual  Behavior
  true         false   Analyze + execute automatically
  true         true    Analyze + execute (forced)
  false        false   Skip - show "Run Query" button
  false        true    Analyze + execute (user clicked)

Auto-execute is disabled for: SqlSourceNode, IntervalIntersectNode, UnionNode, FilterDuringNode, CreateSlicesNode
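The four rows of the auto-execute table reduce to one rule: a query runs when the node auto-executes or when the user explicitly asked for it. A minimal sketch, where the `decideExecution` function and the `ExecutionDecision` type are hypothetical names rather than the service's actual API:

```typescript
type ExecutionDecision = 'execute' | 'skip';

// Hypothetical helper mirroring the autoExecute/manual table.
function decideExecution(autoExecute: boolean, manual: boolean): ExecutionDecision {
  // A manual request always runs; otherwise honour the node's autoExecute flag.
  return autoExecute || manual ? 'execute' : 'skip';
}
```

Only the `autoExecute=false, manual=false` combination yields `'skip'`, which corresponds to showing the "Run Query" button.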

State Management

DataExplorerState (ui/src/plugins/dev.perfetto.DataExplorer/data_explorer.ts)

    interface DataExplorerState {
      rootNodes: QueryNode[];             // Nodes without parents (starting points)
      selectedNodes: ReadonlySet<string>; // Set of selected node IDs (multi-selection)
      nodeLayouts: Map<string, {x, y}>;   // Visual positions
      labels: Array<{...}>;               // Annotations
      isExplorerCollapsed?: boolean;
      sidebarWidth?: number;
      loadGeneration?: number;            // Incremented on content load
      clipboardNodes?: ClipboardEntry[];  // Multi-node copy/paste
      clipboardConnections?: ClipboardConnection[];
    }

Query State Management (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/builder.ts:60-86)

Builder maintains this.query as the single source of truth for query state:

Query State Flow:

Automatic execution (autoExecute=true):
  NodePanel.updateQuery()
    → processNode({ manual: false })
    → onAnalysisComplete → sets NodePanel.currentQuery
    → onAnalysisComplete → calls onQueryAnalyzed callback → sets Builder.query
    → Builder passes query as prop to NodePanel
    → NodePanel.renderContent() uses attrs.query ?? this.currentQuery

Manual execution (autoExecute=false):
  User clicks "Run Query"
    → Builder calls processNode({ manual: true })
    → onAnalysisComplete → sets Builder.query
    → onAnalysisComplete → calls onNodeQueryAnalyzed callback → sets Builder.query
    → Builder passes query as prop to NodePanel
    → NodePanel.renderContent() uses attrs.query (this.currentQuery may be undefined)

This ensures SQL/Proto tabs display correctly for both automatic and manual execution modes.

Race Condition Prevention (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/builder.ts:283-292)

The callback captures the selected node at creation time to prevent stale query leakage:

    const callbackNode = selectedNode;
    this.onNodeQueryAnalyzed = (query) => {
      // Only update if still on the same node
      if (callbackNode === this.previousSelectedNode) {
        this.query = query;
      }
    };

Without this check, rapid node switching can cause:

  1. User selects Node A → async analysis starts
  2. User quickly switches to Node B → Node A's component destroyed
  3. Node A's analysis completes → callback fires with Node A's query
  4. Node B incorrectly displays Node A's query in SQL/Proto tabs

The validation ensures callbacks from old nodes are ignored after switching.
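The guard can be exercised in isolation with a synchronous simulation of the four steps above. The `QueryHolder` class and its `select()` method are hypothetical stand-ins for the Builder; only the capture-and-compare logic is modelled.

```typescript
// Hypothetical stand-in for the Builder's query state.
class QueryHolder {
  query: string | undefined;
  private selected: string | undefined;

  // Selects a node and returns the callback its analysis will invoke later.
  select(nodeId: string): (query: string) => void {
    this.selected = nodeId;
    this.query = undefined;
    const callbackNode = nodeId; // Captured when the callback is created.
    return (query: string) => {
      if (callbackNode === this.selected) {
        this.query = query; // Still on the same node: accept the result.
      }
      // Otherwise the result belongs to a previous selection and is dropped.
    };
  }
}

const holder = new QueryHolder();
const onAnalyzedA = holder.select('A'); // Analysis for A starts.
const onAnalyzedB = holder.select('B'); // User switches before A finishes.
onAnalyzedA('SELECT * FROM a');         // Late result for A is ignored.
const queryAfterStaleCallback = holder.query;
onAnalyzedB('SELECT * FROM b');         // Result for the current node is applied.
```

Because each callback closes over its own `callbackNode`, no cancellation of the in-flight analysis is needed; stale results are simply discarded when they arrive.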

HistoryManager (ui/src/plugins/dev.perfetto.DataExplorer/history_manager.ts)

Graph Operations

Node Creation (ui/src/plugins/dev.perfetto.DataExplorer/node_crud_operations.ts)

    // Source nodes
    async addSourceNode(deps, state, id) {
      const descriptor = nodeRegistry.get(id);
      const initialState = await descriptor.preCreate?.(); // Optional modal
      const newNode = descriptor.factory(initialState);
      rootNodes.push(newNode);
    }

    // Operation nodes
    addOperationNode(deps, state, parentNode, id) {
      // descriptor and initialState are elided here; presumably resolved as in addSourceNode
      const newNode = descriptor.factory(initialState);
      if (singleNodeOperation(newNode.type)) {
        insertNodeBetween(parentNode, newNode); // A → C becomes A → B → C
      } else {
        addConnection(parentNode, newNode); // Multi-input: just connect
      }
    }
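The "A → C becomes A → B → C" step for single-input operations can be sketched on a stripped-down node type. The `ChainNode` interface and the `makeChainNode` and `insertAfter` functions are hypothetical and only model primary connections; the real insertNodeBetween may handle more cases.

```typescript
// Hypothetical minimal node: one primary input, any number of children.
interface ChainNode {
  id: string;
  primaryInput: ChainNode | undefined;
  nextNodes: ChainNode[];
}

function makeChainNode(id: string): ChainNode {
  return {id, primaryInput: undefined, nextNodes: []};
}

// Inserts newNode directly after parentNode and re-parents the existing
// children of parentNode onto newNode.
function insertAfter(parentNode: ChainNode, newNode: ChainNode): void {
  for (const child of parentNode.nextNodes) {
    child.primaryInput = newNode;
    newNode.nextNodes.push(child);
  }
  parentNode.nextNodes = [newNode];
  newNode.primaryInput = parentNode;
}

const nodeA = makeChainNode('A');
const nodeB = makeChainNode('B');
const nodeC = makeChainNode('C');

// Start with A → C, then insert B between them.
nodeA.nextNodes.push(nodeC);
nodeC.primaryInput = nodeA;
insertAfter(nodeA, nodeB);
```

Both link directions are updated in the same function so the graph never has a forward link without the matching backward link.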

Node Deletion (ui/src/plugins/dev.perfetto.DataExplorer/node_crud_operations.ts)

    // Complex reconnection logic preserves data flow
    deleteNode(deps, state, node) {
      1. await cleanupManager.cleanupNode(node); // Drop SQL tables
      2. Capture graph structure (parent, children, port connections)
      3. disconnectNodeFromGraph(node)
      4. Reconnect primary parent to children (bypass deleted node)
         - Only primary connections (portIndex === undefined)
         - Secondary connections dropped (specific to deleted node)
      5. Update root nodes (add orphaned nodes)
      6. Transfer layouts to docked children
      7. Notify affected nodes via onPrevNodesUpdated()
    }
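Steps 2 to 5 of the deletion outline can be illustrated on a graph that has only primary connections. This is a sketch under that simplifying assumption: the `GraphNode` interface and the `makeGraphNode`, `connect`, and `deleteAndBypass` functions are hypothetical, and table cleanup, secondary ports, layouts, and notifications are left out.

```typescript
// Hypothetical minimal node with primary connections only.
interface GraphNode {
  id: string;
  primaryInput: GraphNode | undefined;
  nextNodes: GraphNode[];
}

function makeGraphNode(id: string): GraphNode {
  return {id, primaryInput: undefined, nextNodes: []};
}

function connect(parentNode: GraphNode, child: GraphNode): void {
  parentNode.nextNodes.push(child);
  child.primaryInput = parentNode;
}

// Removes `node`, reconnects its parent to its children, and returns the
// updated list of root nodes.
function deleteAndBypass(roots: GraphNode[], node: GraphNode): GraphNode[] {
  // Capture structure before mutating anything.
  const parentNode = node.primaryInput;
  const children = node.nextNodes.slice();

  // Disconnect the node in both directions.
  if (parentNode !== undefined) {
    parentNode.nextNodes = parentNode.nextNodes.filter((n) => n !== node);
  }
  node.primaryInput = undefined;
  node.nextNodes = [];

  const newRoots = roots.filter((r) => r !== node);
  for (const child of children) {
    if (parentNode !== undefined) {
      connect(parentNode, child); // Bypass the deleted node.
    } else {
      child.primaryInput = undefined;
      newRoots.push(child); // Orphaned child becomes a root.
    }
  }
  return newRoots;
}

// Chain A → B → C: deleting B yields A → C; deleting A then makes C a root.
const a = makeGraphNode('A');
const b = makeGraphNode('B');
const c = makeGraphNode('C');
connect(a, b);
connect(b, c);
const rootsAfterB = deleteAndBypass([a], b);
const cInputAfterB = c.primaryInput;
const rootsAfterA = deleteAndBypass(rootsAfterB, a);
```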

Graph Traversal (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/graph_utils.ts)

Invalidation and Caching

TP-Managed Caching

Query hash caching and change detection are handled entirely by Trace Processor:

Lazy Materialization

Materialization is lazy - TP only materializes a query when querySummarizer() is called for that specific query. When updateSummarizerSpec() is called, all valid queries in the graph are registered with TP, but no SQL is executed. Only when querySummarizer(nodeId) is called does TP actually materialize that query (and its dependencies). This avoids unnecessary work for nodes the user hasn't viewed yet.
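The split between registering queries and materializing them can be modelled with a toy in-memory class. Everything here is an assumption made for illustration: `LazyMaterializer`, its `updateSpec()` and `query()` methods, the `executed` log, and the table naming scheme do not reflect the actual Trace Processor implementation.

```typescript
interface LazyQuery {
  id: string;
  sql: string;
  dependsOn?: string; // Id of the query this one reads from, if any.
}

// Hypothetical model of register-then-materialize-on-demand.
class LazyMaterializer {
  private readonly specs = new Map<string, LazyQuery>();
  private readonly tables = new Map<string, string>();
  readonly executed: string[] = []; // Order in which queries were "run".

  // Registers queries only; nothing is executed here.
  updateSpec(queries: LazyQuery[]): void {
    for (const q of queries) {
      this.specs.set(q.id, q);
    }
  }

  // Materializes the requested query and its dependencies on first access.
  query(id: string): string {
    const cached = this.tables.get(id);
    if (cached !== undefined) return cached;
    const spec = this.specs.get(id);
    if (spec === undefined) throw new Error(`Unknown query: ${id}`);
    if (spec.dependsOn !== undefined) this.query(spec.dependsOn);
    this.executed.push(id); // Stand-in for actually running the SQL.
    const tableName = `_materialized_${id}`;
    this.tables.set(id, tableName);
    return tableName;
  }
}

const tp = new LazyMaterializer();
tp.updateSpec([
  {id: 'A', sql: 'SELECT * FROM slice'},
  {id: 'B', sql: 'SELECT * FROM A WHERE dur > 0', dependsOn: 'A'},
  {id: 'C', sql: 'SELECT COUNT(*) FROM B', dependsOn: 'B'},
]);
const executedAfterRegister = tp.executed.length; // 0: registration runs nothing.
const tableB = tp.query('B'); // Materializes A, then B. C stays untouched.
```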

Smart Re-materialization Optimization

When queries are synced with TP via updateSummarizerSpec(), TP performs intelligent change detection and dependency tracking to minimize redundant work:

  1. Proto-based change detection: Each query's structured query proto bytes are hashed (not the generated SQL). This works correctly for queries with inner_query_id references, which cannot have their SQL generated independently.

  2. Dependency propagation: If query B depends on query A via inner_query_id, and A's proto changes, B must also be re-materialized even if B's proto is unchanged (because B's output depends on A's data). TP propagates this transitively through the entire dependency chain.

  3. Table-source substitution: For unchanged queries that are already materialized, TP substitutes them with simple table-source structured queries that reference the materialized table. When SQL is generated for changed queries, they reference these tables directly instead of re-expanding the full query chain.

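Example: For chain A → B → C → D, if C changes:

  A and B: unchanged, so their materialized tables are reused.
  C: proto changed, so it is re-materialized, reading from B's materialized table.
  D: proto unchanged, but it depends on C, so it is re-materialized as well.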

This optimization significantly speeds up incremental edits in long query chains by avoiding redundant SQL generation and execution. The TP-side implementation lives in src/trace_processor/trace_summary/summarizer.cc.
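Proto-based change detection combined with dependency propagation can be sketched as follows. This is an illustration of the idea only, not the logic in summarizer.cc: the `QuerySpec` shape, the `protoHash` field (a stand-in for a hash of the proto bytes), and the `findStale` function are assumptions, and the sketch assumes the dependency graph is acyclic.

```typescript
interface QuerySpec {
  id: string;
  protoHash: string;     // Stand-in for a hash of the structured query proto.
  innerQueryId?: string; // Dependency on another query, if any.
}

// Returns the ids that need re-materialization, given the hashes seen at the
// previous sync and the current specs.
function findStale(oldHashes: Map<string, string>, specs: QuerySpec[]): Set<string> {
  const byId = new Map<string, QuerySpec>();
  for (const s of specs) {
    byId.set(s.id, s);
  }
  const memo = new Map<string, boolean>();

  const isStale = (id: string): boolean => {
    const known = memo.get(id);
    if (known !== undefined) return known;
    const spec = byId.get(id);
    if (spec === undefined) return true; // Unknown dependency: be conservative.
    const changed = oldHashes.get(id) !== spec.protoHash;
    // Stale if its own proto changed or anything it depends on is stale.
    const stale =
      changed || (spec.innerQueryId !== undefined && isStale(spec.innerQueryId));
    memo.set(id, stale);
    return stale;
  };

  const result = new Set<string>();
  for (const s of specs) {
    if (isStale(s.id)) result.add(s.id);
  }
  return result;
}

// Chain A → B → C → D where only C's proto changed.
const previousHashes = new Map<string, string>([
  ['A', 'a1'], ['B', 'b1'], ['C', 'c1'], ['D', 'd1'],
]);
const currentSpecs: QuerySpec[] = [
  {id: 'A', protoHash: 'a1'},
  {id: 'B', protoHash: 'b1', innerQueryId: 'A'},
  {id: 'C', protoHash: 'c2', innerQueryId: 'B'},
  {id: 'D', protoHash: 'd1', innerQueryId: 'C'},
];
const stale = findStale(previousHashes, currentSpecs); // Contains C and D only.
```

Queries outside the returned set keep their materialized tables, which is what makes the table-source substitution in item 3 possible.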

On-Demand State Queries

The UI queries materialization state from TP when needed:

// Get current state from TP (for "Copy Table Name", export, etc.) const result = await engine.querySummarizer(DATA_EXPLORER_SUMMARIZER_ID, nodeId); // Returns: { exists: boolean, tableName?: string, error?: string, ... }

This design ensures:

Trace Processor Restart Handling

If the Trace Processor restarts or crashes, all summarizer state (including materialized tables) is lost. The UI may still hold a stale summarizerId that no longer exists in TP. When the next querySummarizer() call is made, TP will return an error indicating the summarizer doesn't exist. The UI handles this gracefully by treating it as a need to re-create the summarizer and re-sync all queries on the next execution attempt. Users may see an error message, but clicking "Run Query" again will recover the state.

Structured Query Generation

Query Construction (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/query_builder_utils.ts)

    getStructuredQueries(finalNode) {
      const queries: PerfettoSqlStructuredQuery[] = [];
      let currentNode = finalNode;

      // Walk up the graph from leaf to root
      while (currentNode) {
        queries.push(currentNode.getStructuredQuery());
        currentNode = currentNode.primaryInput; // Follow primary input chain
      }
      return queries.reverse(); // Root → Leaf order
    }

    async analyzeNode(node, engine) {
      const structuredQueries = getStructuredQueries(node);
      const spec = new TraceSummarySpec();
      spec.query = structuredQueries;

      await engine.createSummarizer(ANALYZE_NODE_SUMMARIZER_ID); // Ensure summarizer exists
      await engine.updateSummarizerSpec(ANALYZE_NODE_SUMMARIZER_ID, spec); // Register with TP
      const result = await engine.querySummarizer(ANALYZE_NODE_SUMMARIZER_ID, node.nodeId); // Fetch result
      return {sql: result.sql, textproto: result.textproto};
    }

Serialization and Examples

JSON Serialization (ui/src/plugins/dev.perfetto.DataExplorer/json_handler.ts)

Examples System (ui/src/plugins/dev.perfetto.DataExplorer/examples_modal.ts)

Key Architectural Patterns

1. Node-Based Query Building

All queries constructed via composable nodes:

2. Bidirectional Graph Connections

Nodes maintain both forward and backward links:

3. Two-Phase Execution with Lazy Materialization

4. FIFO Queue with TP-Managed State

5. Structured Query Protocol

6. Modular Pure-Function Architecture

data_explorer.ts delegates business logic to focused modules of pure functions:

Modules:

7. GraphCallbacks Interface (Prop Drilling Reduction)

14 callbacks flow from data_explorer.ts → Builder → Graph:

File Path Reference

Core Infrastructure:

Business Logic Modules (pure functions with explicit dependency injection):

Node System:

UI Components:

Utilities:

Trace Processor (C++):