I have been spending some quality time with gdb tracking down timers firing during idle in Firefox, which turns out to be the main culprit for FF’s energy consumption. Wakeups are triggered either because a timer is fired or a process sleeping on a synchronisation variable is woken up. In fact the former can be reduced to the latter, since also the timer thread which executes the timer callbacks, blocks on a synchronisation variable.
Tracking down wakeups is nothing new and has been done in the past. Automating it on the other hand isn’t trivial without modifying heavily the sourcecode. I tried automating the process through a gdb script but it kept crashing randomly and when it didn’t FF got significantly slower so I gave up on that. There are some alternatives though, e.g. on Linux the perf subsystem can be exploited to track the following events:
- the context switches (shced:sched_switch),
- wakeups (sched:sched_wakeup),
- and the C-states of the CPU (power:cpu_idle).
By keeping track of the CPU C-states we know when the CPU is sleeping. The wakeup probe tells us when a specific thread is woken up, and by correlating it with the previous probe, it gives us the culprit that caused the CPU C-state to change. Finally, the context switch probe allows to retrieve a user-space callgraph of the thread woken up.
That’s an interesting approach and powertop does something similar but it turns out that to investigate wakeups we don’t really need that degree of accuracy. Instead of considering only the context switches caused by wakeups correlated to C-state changes, we can just consider all context switches. Some context switches will not be associated with C-state changes but this heuristic is still good enough on first approximation during an idle workload. On Linux the following command can be used to collect callgraphs of context switches:
As shown in the example above, most of the times a context switch happens we are either in the BackgroundHangManager, in the Timer thread or nsSocketTransportService. Not all timers are created equal though and we would really like to know which timers are firing. Unfortunately since the specific timer callback is executed only after the context switch, there isn’t a simple way to get that sort of information through perf.
The linker comes to our rescue and by using dladdr we can simply retrieve the mangled name of a timer callback handler, which we can demangle and print with __cxa_demangle. This clearly works only if the callbacks are exported in the dynamic symbol table, which on OSX are by default whereas on Linux we have to recompile the codebase with the -export-dynamic linker flag.