Data Explorer Architecture

This document explains how Perfetto's Data Explorer works, from creating visual query graphs to executing SQL queries and displaying results. It covers the key components, data flow, and architectural patterns that enable the Data Explorer to provide an interactive, node-based SQL query builder for trace analysis.

Overview

The Data Explorer is a visual query builder that allows users to construct complex SQL queries by connecting nodes in a directed acyclic graph (DAG). Each node represents either a data source (table, slices, custom SQL) or an operation (filter, aggregation, join, etc.). The system converts this visual graph into structured SQL queries, executes them via the trace processor, and displays results in an interactive data grid.

Core Data Flow

User Interaction → Node Graph → Structured Query Generation →
Query Analysis (Validation) → Query Materialization → Result Display

Node Graph Structure

QueryNode (ui/src/plugins/dev.perfetto.DataExplorer/query_node.ts:128-161)

Base abstraction for all node types
Maintains bidirectional connections: primaryInput (upstream), nextNodes (downstream), secondaryInputs (side connections)
Generates structured query protobuf via getStructuredQuery()
Validates configuration and provides UI rendering methods

Node Connections (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/graph_utils.ts)

Primary Input: Vertical data flow (single parent node)
Secondary Inputs: Horizontal data flow (side connections with port numbers)
Bidirectional relationship management via addConnection()/removeConnection()
Port-based routing for multi-input operations
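
The connection rules above can be sketched in a few lines. This is a minimal illustration, not the code in graph_utils.ts: the SketchNode shape, makeNode(), and the signatures of addConnection()/removeConnection() are assumptions chosen to show how both directions of a link stay in sync.

```typescript
// Hypothetical minimal node shape; the real QueryNode carries much more state.
interface SketchNode {
  id: string;
  primaryInput: SketchNode | undefined;      // single upstream parent
  secondaryInputs: Map<number, SketchNode>;  // port number -> side parent
  nextNodes: SketchNode[];                   // downstream consumers
}

function makeNode(id: string): SketchNode {
  return {id, primaryInput: undefined, secondaryInputs: new Map(), nextNodes: []};
}

// Connects parent -> child. Without a port the link is the primary input;
// with a port it is a secondary (side) input routed to that port.
function addConnection(parent: SketchNode, child: SketchNode, port?: number): void {
  if (port === undefined) {
    child.primaryInput = parent;
  } else {
    child.secondaryInputs.set(port, parent);
  }
  // Keep the forward link in sync, without duplicates.
  if (parent.nextNodes.indexOf(child) === -1) {
    parent.nextNodes.push(child);
  }
}

// Removes every link between parent and child, in both directions.
function removeConnection(parent: SketchNode, child: SketchNode): void {
  if (child.primaryInput === parent) {
    child.primaryInput = undefined;
  }
  const stalePorts: number[] = [];
  child.secondaryInputs.forEach((node, port) => {
    if (node === parent) stalePorts.push(port);
  });
  stalePorts.forEach((port) => child.secondaryInputs.delete(port));
  parent.nextNodes = parent.nextNodes.filter((n) => n !== child);
}
```

Updating both ends inside one helper is what keeps forward and backward links from drifting apart; callers never write primaryInput or nextNodes directly.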

Node Registration and Creation

NodeRegistry (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/node_registry.ts)

Central registry for all node types
Descriptors specify: name, icon, type (source/modification/multisource), factory function
Optional preCreate() hook for interactive setup (e.g., table selection modal)
Supports keyboard shortcuts for rapid node creation

Core Nodes (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/core_nodes.ts)

registerCoreNodes() {
  nodeRegistry.register('table', {...});
  nodeRegistry.register('slice', {...});
  nodeRegistry.register('sql', {...});
  nodeRegistry.register('filter', {...});
  nodeRegistry.register('aggregation', {...});
  // ... more nodes
}
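
As a rough illustration of the descriptor-based registry described above, the sketch below models registration and lookup. The names SketchNodeRegistry, NodeDescriptor, hotkey, and the duplicate-registration error are assumptions made for the example; the real node_registry.ts may differ in all of these.

```typescript
// Hypothetical shapes mirroring the descriptor fields listed above.
type NodeState = Record<string, unknown>;

interface SketchNodeInstance {
  typeId: string;
  state: NodeState;
}

interface NodeDescriptor {
  name: string;
  icon: string;
  type: 'source' | 'modification' | 'multisource';
  hotkey?: string;                              // assumed field for keyboard shortcuts
  preCreate?: () => Promise<NodeState | null>;  // optional interactive setup
  factory: (state: NodeState) => SketchNodeInstance;
}

class SketchNodeRegistry {
  private descriptors = new Map<string, NodeDescriptor>();

  register(id: string, descriptor: NodeDescriptor): void {
    // Assumption: duplicate ids are treated as a programming error.
    if (this.descriptors.has(id)) {
      throw new Error(`Node type '${id}' is already registered`);
    }
    this.descriptors.set(id, descriptor);
  }

  get(id: string): NodeDescriptor | undefined {
    return this.descriptors.get(id);
  }

  // Lists registered ids of one kind, e.g. to populate an "add source" menu.
  list(type: NodeDescriptor['type']): string[] {
    const ids: string[] = [];
    this.descriptors.forEach((d, id) => {
      if (d.type === type) ids.push(id);
    });
    return ids;
  }
}

const sketchRegistry = new SketchNodeRegistry();
sketchRegistry.register('table', {
  name: 'Table',
  icon: 'table_chart',
  type: 'source',
  hotkey: 't',
  factory: (state) => ({typeId: 'table', state}),
});
sketchRegistry.register('filter', {
  name: 'Filter',
  icon: 'filter_alt',
  type: 'modification',
  factory: (state) => ({typeId: 'filter', state}),
});
```

Because creation goes through descriptor.factory(), the UI can build menus and shortcuts from the registry alone, without importing individual node classes.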

Node Types

1. Source Nodes (Data Origin)

TableSourceNode - Queries a specific SQL table
SlicesSourceNode - Pre-configured query for trace slices
SqlSourceNode - Custom SQL query as data source
TimeRangeSourceNode - Generates time intervals

2. Single-Input Modification Nodes

FilterNode - Adds WHERE conditions (autoExecute=false when in 'sql' mode)
SortNode - Adds ORDER BY clauses
AggregationNode - GROUP BY with aggregate functions
ModifyColumnsNode - Renames/removes columns
AddColumnsNode - Adds columns from secondary source via LEFT JOIN and/or computed expressions
LimitAndOffsetNode - Pagination
CounterToIntervalsNode - Converts counter events to time intervals
MetricsNode - Runs pre-defined trace metrics
VisualisationNode - Visualizes query output as a chart
TraceSummaryNode - Renders trace summary data

3. Multi-Input Nodes

UnionNode - Combines rows from multiple sources
JoinNode - Combines columns via JOIN conditions (autoExecute=false by default; switches to true when conditionType is 'equality')
IntervalIntersectNode - Finds overlapping time intervals
FilterDuringNode - Filters using secondary interval input
CreateSlicesNode - Pairs start/end events from two secondary sources into slices

4. Grouping Nodes

GroupNode - Encapsulates an inner sub-graph, exposing it as a single node; inner connections are preserved through serialization

5. Dashboard Nodes

DashboardNode - Connects a query node output to a dashboard visualization

UI Components

Builder (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/builder.ts)

Main component coordinating all sub-components
Manages layout with resizable sidebar and split panel
Three views: Info, Modify (node-specific), Result
Handles node selection, execution callbacks, undo/redo
Receives a GraphCallbacks object and spreads it directly into Graph (no prop drilling)

Graph (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/graph/graph.ts)

Visual canvas for node manipulation
Drag-and-drop positioning with persistent layouts
Connection management via draggable ports
Label annotations for documentation
Defines GraphCallbacks interface (16 callbacks: 14 required, 2 optional) and GraphAttrs extends GraphCallbacks

NodePanel (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/node_panel.ts)

Sidebar panel for selected node
Displays node info, configuration UI, and SQL preview
Triggers query analysis on state changes
Manages execution flow via QueryExecutionService

ResultsPanel (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/results_panel.ts)

Bottom drawer showing query results
Server-side pagination via SQLDataSource
Column-based filtering and sorting
Export to timeline functionality

Query Execution Model

Two-Phase Execution

Phase 1: Analysis (Validation)

Node Graph → Structured Query Protobuf → Engine.updateSummarizerSpec() + querySummarizer() →
Query {sql, textproto, columns} | Error

Creates summarizer via createSummarizer(summarizerId) (once per session)
Registers queries with TP via updateSummarizerSpec(summarizerId, spec)
Fetches SQL and metadata via querySummarizer(summarizerId, queryId) (triggers lazy materialization)
TP computes proto hash for change detection internally

Phase 2: Materialization (Execution)

engine.querySummarizer(summarizerId, nodeId) → TP creates/reuses table →
{tableName, rowCount, columns, durationMs} → SQLDataSource → DataGrid Display

TP creates persistent table for server-side pagination (lazy, on first querySummarizer)
TP handles caching internally (reuses table if proto hash unchanged)
querySummarizer returns all metadata needed for display

QueryExecutionService

Purpose (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/query_execution_service.ts)

Prevents race conditions during rapid user interaction via FIFO execution queue
Debounces rapid requests to batch user input
Coordinates with Trace Processor's materialization API
Runs query analysis (validation) before execution

Trace Processor as Single Source of Truth

All materialization state is managed by Trace Processor (TP), not the UI:

TP tracks which queries are materialized (by query_id)
TP compares proto hashes internally to detect changes
TP creates/drops tables as needed
TP stores table names and error states

The UI queries TP on-demand instead of caching:

// Fetch table name from TP when needed (e.g., for "Copy Table Name" or export)
async getTableName(nodeId: string): Promise<string | undefined> {
  const result = await engine.querySummarizer(DATA_EXPLORER_SUMMARIZER_ID, nodeId);
  if (result.exists !== true || result.error) {
    return undefined;
  }
  return result.tableName;
}

This eliminates state synchronization bugs between UI and TP.

FIFO Execution Queue

Serialized execution (one operation at a time)
Preserves node dependencies (parent materializes before child)
Per-operation error isolation (errors are logged, not thrown)

Rapid Node Click Handling (ui/src/base/async_limiter.ts)

The AsyncLimiter ensures only the latest queued task runs when clicking nodes rapidly:

// AsyncLimiter behavior:
while ((task = taskQueue.shift())) {
  if (taskQueue.length > 0) {
    task.deferred.resolve();  // Skip - newer tasks waiting
  } else {
    await task.work();  // Run - this is the latest
  }
}

Example: Click A → B → C rapidly while A is processing:

A starts processing
B queued, C queued
A finishes
B skipped (queue has C), C runs

This ensures that the currently selected node (C) is processed and that intermediate clicks (B) are skipped.
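
The A → B → C example can be reproduced with a small self-contained limiter. This is a sketch of the latest-wins behavior only: LatestWinsLimiter, schedule(), and the error logging are assumptions for illustration and do not reproduce the actual API of ui/src/base/async_limiter.ts.

```typescript
interface QueuedTask {
  work: () => Promise<void>;
  resolve: () => void;  // settles the promise returned by schedule()
}

// Hypothetical latest-wins limiter: one task runs at a time, and when several
// are waiting only the most recently queued one is executed.
class LatestWinsLimiter {
  private queue: QueuedTask[] = [];
  private running = false;

  schedule(work: () => Promise<void>): Promise<void> {
    const settled = new Promise<void>((resolve) => {
      this.queue.push({work, resolve});
    });
    if (!this.running) {
      this.running = true;
      void this.drain();
    }
    return settled;
  }

  private async drain(): Promise<void> {
    let task: QueuedTask | undefined;
    while ((task = this.queue.shift()) !== undefined) {
      if (this.queue.length === 0) {
        // Nothing newer is waiting: this is the latest task, so run it.
        try {
          await task.work();
        } catch (err) {
          console.error('task failed', err);  // assumption: log and continue
        }
      }
      // Skipped tasks settle too, so callers never wait forever.
      task.resolve();
    }
    this.running = false;
  }
}

// Simulates clicking A, B, C in quick succession; resolves with the names of
// the tasks that actually ran.
async function simulateRapidClicks(): Promise<string[]> {
  const ran: string[] = [];
  const limiter = new LatestWinsLimiter();
  const click = (name: string): Promise<void> =>
    limiter.schedule(async () => {
      await Promise.resolve();  // stand-in for asynchronous query work
      ran.push(name);
    });
  await Promise.all([click('A'), click('B'), click('C')]);
  return ran;
}
```

In this simulation A runs because it starts before anything else is queued, B is dropped because C is already waiting when A finishes, and C runs last.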

Materialization via TP API

// Sync all queries with TP, then fetch result for the target node
async processNode(node: QueryNode): Promise<void> {
  // 1. Ensure summarizer exists (created once per session)
  await engine.createSummarizer(DATA_EXPLORER_SUMMARIZER_ID);

  // 2. Register all queries with TP (handles change detection)
  const spec = buildTraceSummarySpec(allNodes);
  await engine.updateSummarizerSpec(DATA_EXPLORER_SUMMARIZER_ID, spec);

  // 3. Fetch result - triggers lazy materialization
  const result = await engine.querySummarizer(DATA_EXPLORER_SUMMARIZER_ID, node.nodeId);
  // Returns: tableName, rowCount, columns, durationMs, sql, textproto
}

Auto-Execute Logic (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/query_execution_service.ts)

autoExecute   manual   Behavior
true          false    Analyze + execute automatically
true          true     Analyze + execute (forced)
false         false    Skip - show "Run Query" button
false         true     Analyze + execute (user clicked)

Auto-execute disabled for: SqlSourceNode (always), JoinNode (default; switches to true for equality joins), FilterNode (when in 'sql' mode)
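
The table and the per-node exceptions reduce to two small predicates. The sketch below is one way to express them; decideExecution(), autoExecuteFor(), and the NodeConfig union are hypothetical names, not functions from query_execution_service.ts.

```typescript
type ExecutionDecision = 'analyze-and-execute' | 'skip';

// Encodes the four rows of the table: a manual request always runs,
// otherwise the node's autoExecute flag decides.
function decideExecution(autoExecute: boolean, manual: boolean): ExecutionDecision {
  return autoExecute || manual ? 'analyze-and-execute' : 'skip';
}

// Hypothetical, reduced view of node configuration, just enough to express
// the exceptions listed above.
type NodeConfig =
  | {kind: 'sql-source'}
  | {kind: 'join'; conditionType: string}
  | {kind: 'filter'; mode: string}
  | {kind: 'other'};

function autoExecuteFor(node: NodeConfig): boolean {
  switch (node.kind) {
    case 'sql-source':
      return false;  // always manual
    case 'join':
      return node.conditionType === 'equality';  // only equality joins auto-run
    case 'filter':
      return node.mode !== 'sql';  // free-form SQL filters are manual
    default:
      return true;
  }
}
```

A "skip" decision corresponds to showing the "Run Query" button instead of running the query.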

State Management

DataExplorerState (ui/src/plugins/dev.perfetto.DataExplorer/data_explorer.ts)

interface DataExplorerState {
  rootNodes: QueryNode[];                  // Nodes without parents (starting points)
  selectedNodes: ReadonlySet<string>;      // Set of selected node IDs (multi-selection)
  nodeLayouts: Map<string, {x, y}>;       // Visual positions
  labels: Array<{...}>;                   // Annotations
  isExplorerCollapsed?: boolean;
  sidebarWidth?: number;
  loadGeneration?: number;                // Incremented on content load
  clipboardNodes?: ClipboardEntry[];      // Multi-node copy/paste
  clipboardConnections?: ClipboardConnection[];
}

Query State Management (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/builder.ts:60-86)

Builder maintains this.query as the single source of truth for query state:

Updated by both automatic analysis (from NodePanel) and manual execution (from Builder)
Passed to NodePanel as a prop for rendering SQL/Proto tabs
Ensures consistent query display for both autoExecute=true and autoExecute=false nodes

Query State Flow:

Automatic execution (autoExecute=true):
  NodePanel.updateQuery() → processNode({ manual: false })
  → onAnalysisComplete → sets NodePanel.currentQuery
  → onAnalysisComplete → calls onQueryAnalyzed callback → sets Builder.query
  → Builder passes query as prop to NodePanel
  → NodePanel.renderContent() uses attrs.query ?? this.currentQuery

Manual execution (autoExecute=false):
  User clicks "Run Query" → Builder calls processNode({ manual: true })
  → onAnalysisComplete → sets Builder.query
  → onAnalysisComplete → calls onNodeQueryAnalyzed callback → sets Builder.query
  → Builder passes query as prop to NodePanel
  → NodePanel.renderContent() uses attrs.query (this.currentQuery may be undefined)

This ensures SQL/Proto tabs display correctly for both automatic and manual execution modes.

Race Condition Prevention (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/builder.ts:283-292)

The callback captures the selected node at creation time to prevent stale query leakage:

const callbackNode = selectedNode;
this.onNodeQueryAnalyzed = (query) => {
  // Only update if still on the same node
  if (callbackNode === this.previousSelectedNode) {
    this.query = query;
  }
};

Without this check, rapid node switching can cause:

User selects Node A → async analysis starts
User quickly switches to Node B → Node A's component destroyed
Node A's analysis completes → callback fires with Node A's query
Node B incorrectly displays Node A's query in SQL/Proto tabs

The validation ensures callbacks from old nodes are ignored after switching.

HistoryManager (ui/src/plugins/dev.perfetto.DataExplorer/history_manager.ts)

Undo/redo stack with state snapshots
Serialization via serializeState() for each node
Deserialization reconstructs entire graph from JSON
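
A snapshot-based undo/redo stack of this kind can be sketched as follows. SketchHistory is a hypothetical class: it stores JSON strings as snapshots and assumes the state is JSON-serializable, which stands in for the per-node serializeState() calls described above.

```typescript
// Hypothetical snapshot history: each entry is a serialized copy of the state.
class SketchHistory<T> {
  private snapshots: string[] = [];
  private index = -1;  // position of the current snapshot

  push(state: T): void {
    // A new edit discards any redo branch beyond the current position.
    this.snapshots = this.snapshots.slice(0, this.index + 1);
    this.snapshots.push(JSON.stringify(state));
    this.index = this.snapshots.length - 1;
  }

  undo(): T | undefined {
    if (this.index <= 0) return undefined;  // nothing earlier to restore
    this.index--;
    return this.restore();
  }

  redo(): T | undefined {
    if (this.index >= this.snapshots.length - 1) return undefined;
    this.index++;
    return this.restore();
  }

  // Rebuilds the state from the snapshot at the current position.
  private restore(): T | undefined {
    const snapshot = this.snapshots[this.index];
    return snapshot === undefined ? undefined : (JSON.parse(snapshot) as T);
  }
}
```

Storing serialized snapshots rather than live node objects means a restored state can never alias objects that later edits mutate.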

Graph Operations

Node Creation (ui/src/plugins/dev.perfetto.DataExplorer/node_crud_operations.ts)

// Source nodes
async addSourceNode(deps, state, id) {
  const descriptor = nodeRegistry.get(id);
  const initialState = await descriptor.preCreate?.();  // Optional modal
  const newNode = descriptor.factory(initialState);
  rootNodes.push(newNode);  // Source nodes have no parent
}

// Operation nodes
addOperationNode(deps, state, parentNode, id) {
  const descriptor = nodeRegistry.get(id);
  const newNode = descriptor.factory(initialState);  // initialState setup elided
  if (singleNodeOperation(newNode.type)) {
    insertNodeBetween(parentNode, newNode);  // A → C becomes A → B → C
  } else {
    addConnection(parentNode, newNode);       // Multi-input: just connect
  }
}

Node Deletion (ui/src/plugins/dev.perfetto.DataExplorer/node_crud_operations.ts)

// Complex reconnection logic preserves data flow
async deleteNode(deps, state, node) {
  // 1. Drop SQL tables
  await cleanupManager.cleanupNode(node);
  // 2. Capture graph structure (parent, children, port connections)
  // 3. Detach the node from the graph
  disconnectNodeFromGraph(node);
  // 4. Reconnect primary parent to children (bypass deleted node)
  //    - Only primary connections (portIndex === undefined)
  //    - Secondary connections dropped (specific to deleted node)
  // 5. Update root nodes (add orphaned nodes)
  // 6. Transfer layouts to docked children
  // 7. Notify affected nodes via onPrevNodesUpdated()
}
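
Steps 3 to 5 are the core of the reconnection logic and can be illustrated on a primary-only chain. The sketch below is a simplification under stated assumptions: ChainNode, chainNode(), and deleteAndReconnect() are hypothetical, secondary connections are not modeled, and cleanup, layouts, and notifications are omitted.

```typescript
// Hypothetical node with primary connections only.
interface ChainNode {
  id: string;
  primaryInput: ChainNode | undefined;
  nextNodes: ChainNode[];
}

function chainNode(id: string, parent?: ChainNode): ChainNode {
  const node: ChainNode = {id, primaryInput: parent, nextNodes: []};
  if (parent !== undefined) parent.nextNodes.push(node);
  return node;
}

// Removes `node`, reconnects its children to its parent, and returns the
// updated list of root nodes.
function deleteAndReconnect(roots: ChainNode[], node: ChainNode): ChainNode[] {
  // Capture the structure before mutating anything.
  const parent = node.primaryInput;
  const children = node.nextNodes.slice();

  // Detach the node from the graph in both directions.
  if (parent !== undefined) {
    parent.nextNodes = parent.nextNodes.filter((n) => n !== node);
  }
  node.primaryInput = undefined;
  node.nextNodes = [];

  // Bypass the deleted node: A -> B -> C becomes A -> C.
  for (const child of children) {
    child.primaryInput = parent;
    if (parent !== undefined) parent.nextNodes.push(child);
  }

  // Children of a deleted root are orphaned and become roots themselves.
  const newRoots = roots.filter((r) => r !== node);
  if (parent === undefined) {
    for (const child of children) newRoots.push(child);
  }
  return newRoots;
}
```

Capturing parent and children first matters: once the node is detached, that information is gone.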

Graph Traversal (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/graph_utils.ts)

getAllNodes(): BFS traversal (both forward and backward)
getAllDownstreamNodes(): Forward traversal (for invalidation)
getAllUpstreamNodes(): Backward traversal (for dependency checking)
insertNodeBetween(): Rewires connections when inserting operations
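
All three traversals are the same breadth-first walk with a different neighbor function. The sketch below illustrates this under assumptions: GraphNode flattens primary and secondary inputs into one inputs array, the helper names are hypothetical, and the start node is excluded from the upstream and downstream results, which may not match graph_utils.ts.

```typescript
// Hypothetical node shape with upstream links flattened into one array.
interface GraphNode {
  id: string;
  inputs: GraphNode[];     // upstream: primary and secondary inputs
  nextNodes: GraphNode[];  // downstream consumers
}

function graphNode(id: string): GraphNode {
  return {id, inputs: [], nextNodes: []};
}

function link(parent: GraphNode, child: GraphNode): void {
  parent.nextNodes.push(child);
  child.inputs.push(parent);
}

// Generic BFS: visits each reachable node once, in breadth-first order.
function traverse(
  start: GraphNode[],
  neighbours: (n: GraphNode) => GraphNode[],
): GraphNode[] {
  const seen = new Set<GraphNode>();
  const order: GraphNode[] = [];
  const queue = start.slice();
  while (queue.length > 0) {
    const node = queue.shift() as GraphNode;
    if (seen.has(node)) continue;
    seen.add(node);
    order.push(node);
    for (const next of neighbours(node)) queue.push(next);
  }
  return order;
}

// Forward traversal, e.g. to find nodes affected by a change.
function allDownstream(node: GraphNode): GraphNode[] {
  return traverse(node.nextNodes, (n) => n.nextNodes);
}

// Backward traversal, e.g. to find a node's dependencies.
function allUpstream(node: GraphNode): GraphNode[] {
  return traverse(node.inputs, (n) => n.inputs);
}

// Walks both directions so that nodes reachable only through a secondary
// input of a downstream node are still found.
function allNodes(roots: GraphNode[]): GraphNode[] {
  return traverse(roots, (n) => n.inputs.concat(n.nextNodes));
}
```

The seen set is what makes diamond-shaped graphs safe: a node with two parents is reached twice but visited once.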

Invalidation and Caching

TP-Managed Caching

Query hash caching and change detection are handled entirely by Trace Processor:

TP computes and stores proto hashes for each materialized query
When updateSummarizerSpec() is called, TP compares new hash to stored hash
If unchanged, TP returns existing table name without re-execution
If changed, TP drops old table and creates new one

Lazy Materialization

Materialization is lazy - TP only materializes a query when querySummarizer() is called for that specific query. When updateSummarizerSpec() is called, all valid queries in the graph are registered with TP, but no SQL is executed. Only when querySummarizer(nodeId) is called does TP actually materialize that query (and its dependencies). This avoids unnecessary work for nodes the user hasn't viewed yet.

Smart Re-materialization Optimization

When queries are synced with TP via updateSummarizerSpec(), TP performs intelligent change detection and dependency tracking to minimize redundant work:

Proto-based change detection: Each query's structured query proto bytes are hashed (not the generated SQL). This works correctly for queries with inner_query_id references, which cannot have their SQL generated independently.
Dependency propagation: If query B depends on query A via inner_query_id, and A's proto changes, B must also be re-materialized even if B's proto is unchanged (because B's output depends on A's data). TP propagates this transitively through the entire dependency chain.
Table-source substitution: For unchanged queries that are already materialized, TP substitutes them with simple table-source structured queries that reference the materialized table. When SQL is generated for changed queries, they reference these tables directly instead of re-expanding the full query chain.

Example: For chain A → B → C → D, if C changes:

A, B: Unchanged, use existing materialized tables (_exp_mat_0, _exp_mat_1)
C: Changed, re-materialize (SQL references B's materialized table directly)
D: Transitively changed (depends on C), re-materialize (SQL references C's new table)

This optimization significantly speeds up incremental edits in long query chains by avoiding redundant SQL generation and execution. The TP-side implementation lives in src/trace_processor/trace_summary/summarizer.cc.
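
The change-detection and propagation rules can be modeled in a few lines. This is an illustrative TypeScript model, not the C++ logic in summarizer.cc: QuerySpec, queriesToRematerialize(), and the FNV-1a-style string hash are assumptions, and protoBytes is a plain string standing in for serialized proto bytes. The dependency graph is assumed to be acyclic.

```typescript
interface QuerySpec {
  id: string;
  protoBytes: string;     // stand-in for the serialized structured query
  innerQueryId?: string;  // query this one depends on, if any
}

// Small non-cryptographic hash (FNV-1a style over UTF-16 code units).
function hashProto(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Returns the ids of queries that must be re-materialized: those whose own
// proto hash changed, plus everything that depends on them transitively.
function queriesToRematerialize(
  specs: QuerySpec[],
  storedHashes: Map<string, number>,
): string[] {
  const byId = new Map<string, QuerySpec>();
  specs.forEach((s) => byId.set(s.id, s));
  const memo = new Map<string, boolean>();

  const isDirty = (id: string): boolean => {
    const cached = memo.get(id);
    if (cached !== undefined) return cached;
    const spec = byId.get(id);
    if (spec === undefined) return true;  // unknown dependency: be conservative
    const ownChange = storedHashes.get(id) !== hashProto(spec.protoBytes);
    const depChange =
      spec.innerQueryId !== undefined && isDirty(spec.innerQueryId);
    const dirty = ownChange || depChange;
    memo.set(id, dirty);
    return dirty;
  };

  return specs.filter((s) => isDirty(s.id)).map((s) => s.id);
}
```

For the chain A → B → C → D with only C edited, this model reports C and D as needing re-materialization, while A and B keep their stored hashes and would be served from their existing tables.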

On-Demand State Queries

The UI queries materialization state from TP when needed:

// Get current state from TP (for "Copy Table Name", export, etc.)
const result = await engine.querySummarizer(DATA_EXPLORER_SUMMARIZER_ID, nodeId);
// Returns: { exists: boolean, tableName?: string, error?: string, ... }

This design ensures:

No UI-side state can become stale or out of sync with TP
TP is the authoritative source for all materialization state
Simpler UI code with no cache invalidation logic

Trace Processor Restart Handling

If the Trace Processor restarts or crashes, all summarizer state (including materialized tables) is lost. The UI may still hold a stale summarizerId that no longer exists in TP. On the next querySummarizer() call, TP returns an error indicating that the summarizer doesn't exist. The UI treats this error as a signal to re-create the summarizer and re-sync all queries on the next execution attempt. Users may see an error message, but clicking "Run Query" again recovers the state.

Structured Query Generation

Query Construction (ui/src/plugins/dev.perfetto.DataExplorer/query_builder/query_builder_utils.ts)

getStructuredQueries(finalNode) {
  const queries: PerfettoSqlStructuredQuery[] = [];
  let currentNode = finalNode;

  // Walk up the graph from leaf to root
  while (currentNode) {
    queries.push(currentNode.getStructuredQuery());
    currentNode = currentNode.primaryInput;  // Follow primary input chain
  }

  return queries.reverse();  // Root → Leaf order
}

async analyzeNode(node, engine) {
  const structuredQueries = getStructuredQueries(node);
  const spec = new TraceSummarySpec();
  spec.query = structuredQueries;
  await engine.createSummarizer(ANALYZE_NODE_SUMMARIZER_ID);  // Ensure summarizer exists
  await engine.updateSummarizerSpec(ANALYZE_NODE_SUMMARIZER_ID, spec);  // Register with TP
  const result = await engine.querySummarizer(ANALYZE_NODE_SUMMARIZER_ID, node.nodeId);  // Fetch result
  return {sql: result.sql, textproto: result.textproto};
}

Serialization and Examples

JSON Serialization (ui/src/plugins/dev.perfetto.DataExplorer/json_handler.ts)

exportStateAsJson(): Serializes entire graph state to JSON file
deserializeState(): Reconstructs graph from JSON
Each node implements serializeState() for node-specific state
Used for: Import/Export, Examples, Undo/Redo snapshots

Examples System (ui/src/plugins/dev.perfetto.DataExplorer/examples_modal.ts)

Pre-built graphs stored as JSON in ui/src/assets/data_explorer/
Base page state auto-loaded on first visit
Modal allows users to load curated examples

Key Architectural Patterns

1. Node-Based Query Building

All queries constructed via composable nodes:

Sources provide initial data (tables, slices, custom SQL)
Operations transform data (filter, aggregate, join)
Nodes connected via drag-and-drop visual interface
Graph structure maps directly to SQL query structure

2. Bidirectional Graph Connections

Nodes maintain both forward and backward links:

primaryInput: Single parent (vertical data flow)
secondaryInputs: Map of port → parent (side connections)
nextNodes: Array of children (consumers of this node's output)
Graph operations maintain consistency across all links

3. Two-Phase Execution with Lazy Materialization

Analysis phase: Validate query structure without execution
Execution phase: Materialize into PERFETTO table for pagination
Lazy materialization: only materialize selected node and its upstream dependencies
TP manages table caching internally (reuses when proto hash unchanged)
Smart re-materialization: unchanged parent queries use table-source substitution
Server-side pagination via SQLDataSource (no full result fetch)

4. FIFO Queue with TP-Managed State

Prevents race conditions during rapid user input
Operations execute in order (preserves node dependencies)
Per-operation error isolation (one failure doesn't block queue)
TP handles all caching/change detection internally
UI queries TP on-demand for table names (no UI-side caching)

5. Structured Query Protocol

Nodes generate protobuf PerfettoSqlStructuredQuery
Engine validates and converts to SQL via updateSummarizerSpec() + querySummarizer()
Hash-based change detection (proto bytes hashed by TP)
Enables query analysis without SQL string manipulation

6. Modular Pure-Function Architecture

data_explorer.ts delegates business logic to focused modules of pure functions:

Each module defines a Deps interface for its required dependencies
Functions receive dependencies explicitly (no class this access)
data_explorer.ts constructs deps objects and delegates to module functions
Enables testing, reuse, and clear responsibility boundaries

Modules:

node_crud_operations.ts — Node add/delete/duplicate/connect/disconnect (NodeCrudDeps)
datagrid_node_creation.ts — Node creation triggered from DataGrid interactions (DatagridNodeCreationDeps)
clipboard_operations.ts — Multi-node copy/paste
graph_io.ts — Import/export, graph loading, template initialization (GraphIODeps)
node_actions.ts — Closure-based callbacks for node→graph interaction (NodeActionHandlers)
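
The Deps pattern is easiest to see in a reduced form. The sketch below is hypothetical throughout: SketchState, AddRootDeps, and addRoot() are illustrative names and do not appear in node_crud_operations.ts. It shows a module function that receives everything it needs as arguments and reports the new state through an injected callback.

```typescript
// Hypothetical, heavily reduced explorer state.
interface SketchState {
  rootIds: string[];
}

// Everything the function needs from the outside world, made explicit.
interface AddRootDeps {
  createId: () => string;                      // injected so tests stay deterministic
  onStateUpdate: (next: SketchState) => void;  // how results are reported back
}

// Module function: no `this`, no globals. It builds a new state object and
// hands it to the injected callback instead of mutating its input.
function addRoot(deps: AddRootDeps, state: SketchState): void {
  const next: SketchState = {rootIds: state.rootIds.concat(deps.createId())};
  deps.onStateUpdate(next);
}
```

In a test, the deps object is a couple of inline functions, so no component needs to be mounted to exercise the logic.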

7. GraphCallbacks Interface (Prop Drilling Reduction)

16 callbacks flow from data_explorer.ts → Builder → Graph (14 required, 2 optional):

GraphCallbacks interface defined in graph.ts groups all 16 callbacks
BuilderAttrs has a single graphCallbacks: GraphCallbacks field
Builder spreads ...attrs.graphCallbacks directly into Graph component
Eliminates manual forwarding of each callback through Builder
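
The spreading step can be illustrated with a reduced interface. The callback names below (onNodeSelected, onNodeDeleted, onLabelAdded) and the Sketch* types are placeholders, not the actual members of GraphCallbacks; the point is only that one object spread forwards every callback at once.

```typescript
// Hypothetical subset of callbacks: two required, one optional.
interface SketchGraphCallbacks {
  onNodeSelected: (id: string) => void;
  onNodeDeleted: (id: string) => void;
  onLabelAdded?: () => void;
}

// Graph attrs extend the callbacks interface, as GraphAttrs does.
interface SketchGraphAttrs extends SketchGraphCallbacks {
  nodeIds: string[];
}

// Builder attrs carry the callbacks as a single grouped field.
interface SketchBuilderAttrs {
  nodeIds: string[];
  graphCallbacks: SketchGraphCallbacks;
}

// One spread forwards every callback; adding a callback to the interface
// requires no change here.
function buildGraphAttrs(attrs: SketchBuilderAttrs): SketchGraphAttrs {
  return {nodeIds: attrs.nodeIds, ...attrs.graphCallbacks};
}
```

The intermediate component never names individual callbacks, so it cannot forget to forward a newly added one.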

File Path Reference

Core Infrastructure:

ui/src/plugins/dev.perfetto.DataExplorer/index.ts - Plugin entry point, lifecycle hooks, route registration, localStorage/permalink persistence
ui/src/plugins/dev.perfetto.DataExplorer/data_explorer.ts - Main component, tab management, state management, keyboard handling, deps construction
ui/src/plugins/dev.perfetto.DataExplorer/query_node.ts - Node abstraction and type definitions
ui/src/plugins/dev.perfetto.DataExplorer/query_builder/builder.ts - Main UI component (receives GraphCallbacks)
ui/src/plugins/dev.perfetto.DataExplorer/query_builder/query_execution_service.ts - Execution coordination

Business Logic Modules (pure functions with explicit dependency injection):

ui/src/plugins/dev.perfetto.DataExplorer/node_crud_operations.ts - Node add/delete/duplicate/connect/disconnect
ui/src/plugins/dev.perfetto.DataExplorer/datagrid_node_creation.ts - Node creation triggered from DataGrid interactions
ui/src/plugins/dev.perfetto.DataExplorer/clipboard_operations.ts - Multi-node copy/paste
ui/src/plugins/dev.perfetto.DataExplorer/graph_io.ts - Import/export, graph loading, template initialization
ui/src/plugins/dev.perfetto.DataExplorer/node_actions.ts - Closure-based callbacks for node→graph interaction

Node System:

ui/src/plugins/dev.perfetto.DataExplorer/query_builder/node_registry.ts - Node registration
ui/src/plugins/dev.perfetto.DataExplorer/query_builder/core_nodes.ts - Core node registration
ui/src/plugins/dev.perfetto.DataExplorer/query_builder/nodes/ - Individual node implementations

UI Components:

ui/src/plugins/dev.perfetto.DataExplorer/query_builder/graph/graph.ts - Visual graph canvas (defines GraphCallbacks)
ui/src/plugins/dev.perfetto.DataExplorer/query_builder/node_panel.ts - Node sidebar
ui/src/plugins/dev.perfetto.DataExplorer/query_builder/results_panel.ts - Results drawer (server-side pagination, column/filter/sort)

Utilities:

ui/src/plugins/dev.perfetto.DataExplorer/query_builder/graph_utils.ts - Graph traversal and connection management
ui/src/plugins/dev.perfetto.DataExplorer/query_builder/query_builder_utils.ts - Query analysis and utilities
ui/src/plugins/dev.perfetto.DataExplorer/query_builder/cleanup_manager.ts - Resource cleanup
ui/src/plugins/dev.perfetto.DataExplorer/history_manager.ts - Undo/redo management
ui/src/plugins/dev.perfetto.DataExplorer/json_handler.ts - Serialization

Trace Processor (C++):

src/trace_processor/trace_summary/summarizer.cc - Smart re-materialization with change detection and dependency propagation
src/trace_processor/trace_summary/summarizer.h - Summarizer class definition and QueryState
src/trace_processor/perfetto_sql/generator/structured_query_generator.cc - SQL generation from structured queries