Debugging memory usage on Android

Prerequisites

If you are profiling your own app and are not running a userdebug build of Android, your app needs to be marked as profileable or debuggable in its manifest. See the heapprofd documentation for more details on which applications can be targeted.

dumpsys meminfo

A good place to get started investigating memory usage of a process is dumpsys meminfo which gives a high-level overview of how much of the various types of memory are being used by a process.

$ adb shell dumpsys meminfo com.android.systemui Applications Memory Usage (in Kilobytes): Uptime: 2030149 Realtime: 2030149 ** MEMINFO in pid 1974 [com.android.systemui] ** Pss Private Private SwapPss Rss Heap Heap Heap Total Dirty Clean Dirty Total Size Alloc Free ------ ------ ------ ------ ------ ------ ------ ------ Native Heap 16840 16804 0 6764 19428 34024 25037 5553 Dalvik Heap 9110 9032 0 136 13164 36444 9111 27333 [more stuff...]

Looking at the "Private Dirty" column of Dalvik Heap (= Java Heap) and Native Heap, we can see that SystemUI's memory usage on the Java heap is 9M, on the native heap it's 17M.

Linux memory management

But what does clean, dirty, Rss, Pss, Swap actually mean? To answer this question, we need to delve into Linux memory management a bit.

From the kernel's point of view, memory is split into equally sized blocks called pages. These are generally 4KiB.

Pages are organized in virtually contiguous ranges called VMA (Virtual Memory Area).

VMAs are created when a process requests a new pool of memory pages through the mmap() system call. Applications rarely call mmap() directly. Those calls are typically mediated by the allocator, malloc()/operator new() for native processes or by the Android RunTime for Java apps.

VMAs can be of two types: file-backed and anonymous.

File-backed VMAs are a view of a file in memory. They are obtained passing a file descriptor to mmap(). The kernel will serve page faults on the VMA through the passed file, so reading a pointer to the VMA becomes the equivalent of a read() on the file. File-backed VMAs are used, for instance, by the dynamic linker (ld) when executing new processes or dynamically loading libraries, or by the Android framework, when loading a new .dex library or accessing resources in the APK.

Anonymous VMAs are memory-only areas not backed by any file. This is the way allocators request dynamic memory from the kernel. Anonymous VMAs are obtained calling mmap(... MAP_ANONYMOUS ...).

Physical memory is only allocated, in page granularity, once the application tries to read/write from a VMA. If you allocate 32 MiB worth of pages but only touch one byte, your process' memory usage will only go up by 4KiB. You will have increased your process' virtual memory by 32 MiB, but its resident physical memory by 4 KiB.

When optimizing memory use of programs, we are interested in reducing their footprint in physical memory. High virtual memory use is generally not a cause for concern on modern platforms (except if you run out of address space, which is very hard on 64 bit systems).

We call the amount a process' memory that is resident in physical memory its RSS (Resident Set Size). Not all resident memory is equal though.

From a memory-consumption viewpoint, individual pages within a VMA can have the following states:

It is generally more important to reduce the amount of dirty memory as that cannot be reclaimed like clean memory and, on Android, even if swapped in ZRAM, will still eat part of the system memory budget. This is why we looked at Private Dirty in the dumpsys meminfo example.

Shared memory can be mapped into more than one process. This means VMAs in different processes refer to the same physical memory. This typically happens with file-backed memory of commonly used libraries (e.g., libc.so, framework.dex) or, more rarely, when a process fork()s and a child process inherits dirty memory from its parent.

This introduces the concept of PSS (Proportional Set Size). In PSS, memory that is resident in multiple processes is proportionally attributed to each of them. If we map one 4KiB page into four processes, each of their PSS will increase by 1KiB.

Recap

Memory over time

dumpsys meminfo is good to get a snapshot of the current memory usage, but even very short memory spikes can lead to low-memory situations, which will lead to LMKs. We have two tools to investigate situations like this

RSS High Watermark

We can get a lot of information from the /proc/[pid]/status file, including memory information. VmHWM shows the maximum RSS usage the process has seen since it was started. This value is kept updated by the kernel.

$ adb shell cat '/proc/$(pidof com.android.systemui)/status' [...] VmHWM: 256972 kB VmRSS: 195272 kB RssAnon: 30184 kB RssFile: 164420 kB RssShmem: 668 kB VmSwap: 43960 kB [...]

Memory tracepoints

NOTE: For detailed instructions about the memory trace points see the Data sources > Memory > Counters and events page.

We can use Perfetto to get information about memory management events from the kernel.

$ adb shell perfetto \ -c - --txt \ -o /data/misc/perfetto-traces/trace \ <<EOF buffers: { size_kb: 8960 fill_policy: DISCARD } buffers: { size_kb: 1280 fill_policy: DISCARD } data_sources: { config { name: "linux.process_stats" target_buffer: 1 process_stats_config { scan_all_processes_on_start: true } } } data_sources: { config { name: "linux.ftrace" ftrace_config { ftrace_events: "mm_event/mm_event_record" ftrace_events: "kmem/rss_stat" ftrace_events: "kmem/ion_heap_grow" ftrace_events: "kmem/ion_heap_shrink" } } } duration_ms: 30000 EOF

While it is running, take a photo if you are following along.

Pull the file using adb pull /data/misc/perfetto-traces/trace ~/mem-trace and upload to the Perfetto UI. This will show overall stats about system ION usage, and per-process stats to expand. Scroll down (or Ctrl-F for) to com.google.android.GoogleCamera and expand. This will show a timeline for various memory stats for camera.

Camera Memory Trace

We can see that around 2/3 into the trace, the memory spiked (in the mem.rss.anon track). This is where I took a photo. This is a good way to see how the memory usage of an application reacts to different triggers.

Which tool to use

If you want to drill down into anonymous memory allocated by Java code, labeled by dumpsys meminfo as Dalvik Heap, see the Analyzing the java heap section.

If you want to drill down into anonymous memory allocated by native code, labeled by dumpsys meminfo as Native Heap, see the Analyzing the Native Heap section. Note that it's frequent to end up with native memory even if your app doesn't have any C/C++ code. This is because the implementation of some framework API (e.g. Regex) is internally implemented through native code.

If you want to drill down into file-mapped memory the best option is to use adb shell showmap PID (on Android) or inspect /proc/PID/smaps.

Low-memory kills

When an Android device becomes low on memory, a daemon called lmkd will start killing processes in order to free up memory. Devices' strategies differ, but in general processes will be killed in order of descending oom_score_adj score (i.e. background apps and processes first, foreground processes last).

Apps on Android are not killed when switching away from them. They instead remain cached even after the user finishes using them. This is to make subsequent starts of the app faster. Such apps will generally be killed first (because they have a higher oom_score_adj).

We can collect information about LMKs and oom_score_adj using Perfetto.

$ adb shell perfetto \ -c - --txt \ -o /data/misc/perfetto-traces/trace \ <<EOF buffers: { size_kb: 8960 fill_policy: DISCARD } buffers: { size_kb: 1280 fill_policy: DISCARD } data_sources: { config { name: "linux.process_stats" target_buffer: 1 process_stats_config { scan_all_processes_on_start: true } } } data_sources: { config { name: "linux.ftrace" ftrace_config { ftrace_events: "lowmemorykiller/lowmemory_kill" ftrace_events: "oom/oom_score_adj_update" ftrace_events: "ftrace/print" atrace_apps: "lmkd" } } } duration_ms: 60000 EOF

Pull the file using adb pull /data/misc/perfetto-traces/trace ~/oom-trace and upload to the Perfetto UI.

OOM Score

We can see that the OOM score of Camera gets reduced (making it less likely to be killed) when it is opened, and gets increased again once it is closed.

Analyzing the Native Heap

Native Heap Profiles require Android 10.

NOTE: For detailed instructions about the native heap profiler and troubleshooting see the Data sources > Heap profiler page.

Applications usually get memory through malloc or C++'s new rather than directly getting it from the kernel. The allocator makes sure that your memory is more efficiently handled (i.e. there are not many gaps) and that the overhead from asking the kernel remains low.

We can log the native allocations and frees that a process does using heapprofd. The resulting profile can be used to attribute memory usage to particular function callstacks, supporting a mix of both native and Java code. The profile will only show allocations done while it was running, any allocations done before will not be shown.

Capturing the profile

Use the tools/heap_profile script to profile a process. If you are having trouble make sure you are using the latest version. See all the arguments using tools/heap_profile -h, or use the defaults and just profile a process (e.g. system_server):

$ tools/heap_profile -n system_server Profiling active. Press Ctrl+C to terminate. You may disconnect your device. Wrote profiles to /tmp/profile-1283e247-2170-4f92-8181-683763e17445 (symlink /tmp/heap_profile-latest) These can be viewed using pprof. Googlers: head to pprof/ and upload them.

When you see Profiling active, play around with the phone a bit. When you are done, press Ctrl-C to end the profile. For this tutorial, I opened a couple of apps.

Viewing the data

Then upload the raw-trace file from the output directory to the Perfetto UI and click on diamond marker that shows.

Profile Diamond

The tabs that are available are

The default view will show you all allocations that were done while the profile was running but that weren't freed (the space tab).

Native Flamegraph

We can see that a lot of memory gets allocated in paths through AssetManager.applyStyle. To get the total memory that was allocated this way, we can enter "applyStyle" into the Focus textbox. This will only show callstacks where some frame matches "applyStyle".

Native Flamegraph with Focus

From this we have a clear idea where in the code we have to look. From the code we can see how that memory is being used and if we actually need all of it.

Analyzing the Java Heap

Java Heap Dumps require Android 11.

NOTE: For detailed instructions about capturing Java heap dumps and troubleshooting see the Data sources > Java heap dumps page.

Dumping the java heap

We can get a snapshot of the graph of all the Java objects that constitute the Java heap. We use the tools/java_heap_dump script. If you are having trouble make sure you are using the latest version.

$ tools/java_heap_dump -n com.android.systemui Dumping Java Heap. Wrote profile to /tmp/tmpup3QrQprofile This can be viewed using https://ui.perfetto.dev.

Viewing the Data

Upload the trace to the Perfetto UI and click on diamond marker that shows.

Profile Diamond

This will present a set of flamegraph views as explained below.

"Size" and "Objects" tabs

Java Flamegraph: Size

These views show the memory attributed to the shortest path to a garbage-collection root. In general an object is reachable by many paths, we only show the shortest as that reduces the complexity of the data displayed and is generally the highest-signal. The rightmost [merged] stacks is the sum of all objects that are too small to be displayed.

If we want to only see callstacks that have a frame that contains some string, we can use the Focus feature. If we want to know all allocations that have to do with notifications, we can put "notification" in the Focus box.

As with native heap profiles, if we want to focus on some specific aspect of the graph, we can filter by the names of the classes. If we wanted to see everything that could be caused by notifications, we can put "notification" in the Focus box.

Java Flamegraph with Focus

We aggregate the paths per class name, so if there are multiple objects of the same type retained by a java.lang.Object[], we will show one element as its child, as you can see in the leftmost stack above. This also applies to the dominator tree paths as described below.

"Dominated Size" and "Dominated Objects" tabs

Java Flamegraph: Dominated Size

Another way to present the heap graph as a flamegraph (a tree) is to show its dominator tree. In a heap graph, an object a dominates an object b if b is reachable from the root only via paths that go through a. The dominators of an object form a chain from the root and the object is exclusvely retained by all objects on this chain. For all reachable objects in the graph those chains form a tree, i.e. the dominator tree.

We aggregate the tree paths per class name, and each element (tree node) represents a set of objects that have the same class name and position in the dominator tree.