Batch Trace Processor
The Batch Trace Processor is a Python library wrapping the Trace Processor: it allows fast (<1s) interactive queries on large sets (up to ~1000) of traces.
Installation
Batch Trace Processor is part of the perfetto Python library and can be installed by running:
pip3 install pandas # prerequisite for Batch Trace Processor
pip3 install perfetto
Loading traces
NOTE: if you are a Googler, have a look at go/perfetto-btp-load-internal for how to load traces from Google-internal sources.
The simplest way to load traces is to pass a list of file paths:
from perfetto.batch_trace_processor.api import BatchTraceProcessor
files = [
  'traces/slow-start.pftrace',
  'traces/oom.pftrace',
  'traces/high-battery-drain.pftrace',
]
with BatchTraceProcessor(files) as btp:
  btp.query('...')
glob can be used to load all traces in a directory:
import glob
from perfetto.batch_trace_processor.api import BatchTraceProcessor
files = glob.glob('traces/*.pftrace')
with BatchTraceProcessor(files) as btp:
  btp.query('...')
NOTE: loading too many traces can cause out-of-memory issues: see the "Memory usage" section below for details.
A common requirement is to load traces located in the cloud or by sending a request to a server. To support this use case, traces can also be loaded using trace URIs:
from perfetto.batch_trace_processor.api import BatchTraceProcessor
from perfetto.batch_trace_processor.api import BatchTraceProcessorConfig
from perfetto.trace_processor.api import TraceProcessorConfig
from perfetto.trace_uri_resolver.registry import ResolverRegistry
from perfetto.trace_uri_resolver.resolver import TraceUriResolver
class FooResolver(TraceUriResolver):
  # See "Trace URIs" section below for how to implement a URI resolver.
  ...
config = BatchTraceProcessorConfig(
  # See "Trace URIs" below
)
# Key-value pairs are separated by ';' as described in "Trace URIs" below.
with BatchTraceProcessor('foo:bar=1;baz=abc', config=config) as btp:
  btp.query('...')
Writing queries
Writing queries with batch trace processor works very similarly to the Python API.
For example, to get a count of the number of userspace slices:
>>> btp.query('select count(1) from slice')
[   count(1)
0   2092592,    count(1)
0    156071,    count(1)
0    121431]
The return value of query is a list of Pandas dataframes, one for each trace loaded.
A common requirement is for all of the traces to be flattened into a single dataframe instead of getting one dataframe per trace. To support this, the query_and_flatten function can be used:
>>> btp.query_and_flatten('select count(1) from slice')
   count(1)
0   2092592
1    156071
2    121431
query_and_flatten also implicitly adds columns indicating the originating trace. The exact columns added depend on the resolver being used: consult your resolver's documentation for more information.
Trace URIs
Trace URIs are a powerful feature of the batch trace processor. URIs decouple the notion of "paths" to traces from the filesystem. Instead, the URI describes how a trace should be fetched (e.g. by sending an HTTP request to a server, from cloud storage etc.).
The syntax of trace URIs is similar to that of web URLs. Formally, a trace URI has the structure:
Trace URI = protocol:key1=val1(;keyn=valn)*
As an example:
gcs:bucket=foo;path=bar
would indicate that traces should be fetched using the protocol gcs (Google Cloud Storage) with traces located at bucket foo and path bar in the bucket.
NOTE: the gcs resolver is not actually included: it's simply given as an easy-to-understand example.
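To make the grammar above concrete, here is a minimal sketch of how a URI of this shape could be split into a protocol and its key-value pairs. The function name parse_trace_uri is hypothetical and this is not the parser that ships with perfetto; it only follows the structure stated above (one protocol, then at least one key=value pair, separated by semicolons).

```python
from typing import Dict, Tuple


def parse_trace_uri(uri: str) -> Tuple[str, Dict[str, str]]:
    """Split 'protocol:key1=val1;key2=val2' into (protocol, {key: val}).

    Illustrative only: a hypothetical helper, not perfetto's own parser.
    """
    protocol, sep, rest = uri.partition(':')
    if not sep or not protocol:
        raise ValueError(f'missing protocol in trace URI: {uri!r}')
    params: Dict[str, str] = {}
    for pair in rest.split(';'):
        key, eq, value = pair.partition('=')
        if not eq or not key:
            raise ValueError(f'malformed key=value pair: {pair!r}')
        params[key] = value
    return protocol, params


protocol, params = parse_trace_uri('gcs:bucket=foo;path=bar')
print(protocol, params)  # gcs {'bucket': 'foo', 'path': 'bar'}
```

A string without a protocol or with a pair lacking "=" raises ValueError in this sketch, which mirrors the grammar requiring at least one key=value pair.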
URIs are only part of the puzzle: ultimately batch trace processor still needs the bytes of the traces to be able to parse and query them. The job of converting URIs to trace bytes is left to resolvers: Python classes which are each associated with a protocol and use the key-value pairs in the URI to look up the traces to be parsed.
By default, batch trace processor only ships with a single resolver which knows how to look up filesystem paths: however, custom resolvers can be easily created and registered. See the documentation on the TraceUriResolver class for information on how to do this.
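The idea of dispatching from a protocol to a resolver can be modelled in a few lines of plain Python. Everything in this sketch is a hypothetical stand-in (InMemoryResolver, RESOLVERS, resolve_uri and the made-up "mem" protocol): it does not use or reproduce perfetto's TraceUriResolver or ResolverRegistry classes, whose real interface is described in their own documentation.

```python
from typing import Callable, Dict, List


# Hypothetical stand-in: NOT perfetto's TraceUriResolver, only a model of
# the "protocol -> resolver -> trace bytes" flow.
class InMemoryResolver:
    """Toy resolver for a made-up 'mem' protocol backed by a dict."""

    STORE = {'a': b'trace-a-bytes', 'b': b'trace-b-bytes'}

    def __init__(self, name: str):
        self.name = name

    def resolve(self) -> List[bytes]:
        # A real resolver would fetch bytes from disk, a server, etc.
        return [self.STORE[self.name]]


# Hypothetical registry: maps each protocol to the class handling it.
RESOLVERS: Dict[str, Callable[..., InMemoryResolver]] = {
    'mem': InMemoryResolver,
}


def resolve_uri(uri: str) -> List[bytes]:
    """Look up the resolver for the URI's protocol and ask it for bytes."""
    protocol, _, rest = uri.partition(':')
    # The key-value pairs become constructor arguments of the resolver.
    kwargs = dict(pair.split('=', 1) for pair in rest.split(';'))
    return RESOLVERS[protocol](**kwargs).resolve()


print(resolve_uri('mem:name=a'))  # [b'trace-a-bytes']
```

Passing the key-value pairs as keyword arguments keeps each resolver's constructor self-describing in this model; whether the real classes work the same way should be checked against the TraceUriResolver documentation.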
Memory usage
Memory usage is very important to pay attention to when working with batch trace processor. Every trace loaded lives fully in memory: this is the magic behind making queries fast (<1s) even on hundreds of traces.
This also means that the number of traces you can load is heavily limited by the amount of memory available. As a rule of thumb, if your average trace size is S and you are trying to load N traces, expect roughly 2 * S * N of memory usage. Note that this can vary significantly based on the exact contents and sizes of your traces.
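The rule of thumb can be turned into a quick back-of-the-envelope check before loading a large batch. The helper names below are hypothetical and the factor of 2 is only the rough estimate given above, so treat the results as a starting point rather than a guarantee.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def estimated_memory_bytes(avg_trace_size_bytes: int, num_traces: int) -> int:
    """Rule-of-thumb estimate from the text: 2 * S * N."""
    return 2 * avg_trace_size_bytes * num_traces


def max_traces_for_budget(avg_trace_size_bytes: int,
                          memory_budget_bytes: int) -> int:
    """Largest N such that 2 * S * N fits within the given budget."""
    if avg_trace_size_bytes <= 0:
        raise ValueError('average trace size must be positive')
    return memory_budget_bytes // (2 * avg_trace_size_bytes)


# 200 traces averaging 50 MiB each.
print(estimated_memory_bytes(50 * MIB, 200) / GIB)  # 19.53125
# How many 50 MiB traces fit into a 16 GiB budget?
print(max_traces_for_budget(50 * MIB, 16 * GIB))  # 163
```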
Advanced features
Sharing computations between TP and BTP
Sometimes it can be useful to parameterise code to work with either trace processor or batch trace processor. execute or execute_and_flatten can be used for this purpose:
def some_complex_calculation(tp):
  res = tp.query('...').as_pandas_dataframe()
  # ... do some calculations with res
  return res
# |some_complex_calculation| can be called with a |TraceProcessor| object:
tp = TraceProcessor('/foo/bar.pftrace')
some_complex_calculation(tp)
# |some_complex_calculation| can also be passed to |execute| or
# |execute_and_flatten|
btp = BatchTraceProcessor(['...', '...', '...'])
# Like |query|, |execute| returns one result per trace. Note that the returned
# value *does not* have to be a Pandas dataframe.
[a, b, c] = btp.execute(some_complex_calculation)
# Like |query_and_flatten|, |execute_and_flatten| merges the Pandas dataframes
# returned per trace into a single dataframe, adding any columns requested by
# the resolver.
flattened_res = btp.execute_and_flatten(some_complex_calculation)