I am trying to profile a C application on an embedded device. Using VTune, I found out that the app launches hundreds of threads, most of which are active for only a small percentage of the total time.
I want to get details of the context switches that are happening (preferably in some kind of timeline view). I have yet to come across a tool that can show context switch information. Is there a profiler that provides this? Or some other way to get this info?
Thanks.
CodePudding user response:
On Linux you could use cat /proc/{PID}/status to get some information on threads, including the voluntary_ctxt_switches and nonvoluntary_ctxt_switches counters.
For example:
uname@hostname:/$ cat /proc/1357/status
Name: avahi-daemon
Umask: 0022
State: S (sleeping)
Tgid: 1357
Ngid: 0
Pid: 1357
PPid: 1
TracerPid: 0
Uid: 107 107 107 107
Gid: 114 114 114 114
FDSize: 128
Groups: 114
NStgid: 1357
NSpid: 1357
NSpgid: 1357
NSsid: 1357
VmPeak: 10500 kB
VmSize: 8288 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 3664 kB
VmRSS: 2700 kB
RssAnon: 328 kB
RssFile: 2372 kB
RssShmem: 0 kB
VmData: 468 kB
VmStk: 132 kB
VmExe: 92 kB
VmLib: 3720 kB
VmPTE: 52 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
CoreDumping: 0
Threads: 1
SigQ: 0/31668
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000001000
SigCgt: 0000000180004203
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Speculation_Store_Bypass: thread vulnerable
Cpus_allowed: ffffffff,ffffffff,ffffffff,ffffffff
Cpus_allowed_list: 0-127
Mems_allowed: 00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 1610
nonvoluntary_ctxt_switches: 25
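Since the application in question has hundreds of threads, the same two counters can be read per thread from /proc/{PID}/task/{TID}/status (as far as I know, the counters in /proc/{PID}/status cover only the main thread). Below is a minimal Python sketch of that idea, assuming a Linux /proc filesystem and a Python interpreter on the target; the function names parse_status and thread_switch_counts are made up for this illustration.

```python
import os
import sys


def parse_status(text):
    """Turn the 'Key: value' lines of a /proc status file into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def thread_switch_counts(pid):
    """Return a list of (tid, name, voluntary, nonvoluntary) for each thread of pid."""
    rows = []
    task_dir = f"/proc/{pid}/task"
    for tid in sorted(os.listdir(task_dir), key=int):
        try:
            with open(os.path.join(task_dir, tid, "status")) as f:
                fields = parse_status(f.read())
        except OSError:
            continue  # the thread exited between listdir() and open()
        rows.append((
            int(tid),
            fields.get("Name", "?"),
            int(fields.get("voluntary_ctxt_switches", 0)),
            int(fields.get("nonvoluntary_ctxt_switches", 0)),
        ))
    return rows


if __name__ == "__main__":
    # Use the PID given on the command line, or this script's own PID as a demo.
    has_pid = len(sys.argv) > 1 and sys.argv[1].isdigit()
    pid = int(sys.argv[1]) if has_pid else os.getpid()
    for tid, name, vol, nonvol in thread_switch_counts(pid):
        print(f"{tid:>8} {name:<16} voluntary={vol} nonvoluntary={nonvol}")
```

Saved under a name of your choice and run with the PID of the application as its only argument, it prints one line per thread. These are cumulative totals only, so they show which threads switch the most but not when.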
CodePudding user response:
This answer is specific to Linux. It would help if you specified which OS you are using; otherwise you may get a solution you cannot use.
If you have Linux perf events available, you can get a visual timeline of the context switches in your application by running perf timechart record followed by perf timechart, which writes the chart to an SVG file (output.svg by default). If the recording is long, processing the result may take a while.
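If perf is not available on the target, a much coarser timeline can be approximated by polling the per-thread counters under /proc/{PID}/task at a fixed interval and printing the differences. The Python sketch below illustrates this under the assumption of a Linux /proc filesystem; the helper names (read_counts, diff_counts, poll) and the interval values are arbitrary choices for the example, and it cannot see switches at a finer resolution than the polling interval.

```python
import os
import time

COUNTER_KEYS = ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches")


def read_counts(pid):
    """Snapshot {tid: voluntary + nonvoluntary context switches} for pid."""
    counts = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/status") as f:
                total = sum(
                    int(line.split(":")[1])
                    for line in f
                    if line.startswith(COUNTER_KEYS)
                )
        except OSError:
            continue  # the thread exited while we were reading
        counts[int(tid)] = total
    return counts


def diff_counts(before, after):
    """Switches per thread between two snapshots; new threads count from zero."""
    deltas = {}
    for tid, total in after.items():
        delta = total - before.get(tid, 0)
        if delta > 0:
            deltas[tid] = delta
    return deltas


def poll(pid, interval=0.1, samples=20):
    """Yield (elapsed seconds, {tid: switches during the last interval})."""
    start = time.monotonic()
    before = read_counts(pid)
    for _ in range(samples):
        time.sleep(interval)
        after = read_counts(pid)
        yield time.monotonic() - start, diff_counts(before, after)
        before = after


if __name__ == "__main__":
    # Demo on this script's own PID; replace with the PID of the application.
    for elapsed, deltas in poll(os.getpid(), interval=0.1, samples=5):
        print(f"{elapsed:6.2f}s {deltas}")
```

Each output line lists only the threads that were switched at least once during that interval, which is usually enough to spot bursts. The sketch does not handle the target process exiting during polling.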
If you want to know which parts of your program are the culprit, it may be better to use perf record -e context-switches --call-graph XXX (where XXX is the unwinding method: fp, dwarf or lbr) to capture the backtrace whenever a context switch happens. See the perf manual for more details on the command line options. Once you have collected some trace data, you can inspect it with perf report. I believe Intel VTune is still able to open perf traces, but you need to rename the file from the default perf.data to a file name ending with the .perf extension.