Version 2 (modified by cotto, 13 years ago) |
---|
Introduction
Callgrind's output format is very general because it's targeted by a number of different tools. This page describes Callgrind's output format as it pertains to PIR-level profiling for Parrot.
Terms
- Process under test - This is the process of the program being profiled, i.e. a PIR program of some sort.
- Profile - This refers to the output file containing profiling information for the process under test.
Format Description
The format consists of a header and a body. The header consists of several key value pairs which describe the purpose of the profile. KCachegrind ignores lines it doesn't know about. In the example below, this feature is used to add comments to an example profile. See the bottom of this page for information about how this example was generated.
Example
Example C Code
The following C code was used to generate a (nearly) trivial profile:
#include <stdio.h> #include <valgrind/callgrind.h> void func1(); void func2(); void func3(); int main() { CALLGRIND_START_INSTRUMENTATION; func1(); func2(); func3(); CALLGRIND_STOP_INSTRUMENTATION; } void func1() { int i; for (i = 0; i < 1000000; i++); } void func2() { int i; for (i = 0; i < 1000000; i++); func1(); } void func3() { int i; for (i = 0; i < 1000000; i++); func2(); func1(); }
The program was compiled and profiled as follows:
gcc callgrind_test.c -g -o callgrind_test valgrind --tool=callgrind --compress-strings=no --compress-pos=no --instr-atstart=no ./callgrind_test
Annotated Header from the Profile of above C Code
The above valgrind invocation produced a profile in callgrind.out.PID, where PID is the process ID of the process under test. Below is the header of the profile, annotated with a description of the purpose of each line.
# version of the Callgrind format version: 1 # what created this profile, informational only creator: 3.4.1-Debian # PID of the process under test pid: 5751 # the full CLI invocation that started the process under test cmd: ./callgrind_test # used when multiple dumps are produced, starts from 1 and increments part: 1 # parameters used for the cache simulator; not relevant for function profiling desc: I1 cache: desc: D1 cache: desc: L2 cache: # not really sure desc: Timerange: Basic block 0 - 7000029 # Why did the process under test stop? (informational only) desc: Trigger: Program termination # (optional) what do positions mean? # "line" - (default) positions are line numbers in a file # "instr" - positions are offsets in a binary representing instructions positions: line # (required) event types logged by this file # "Ir" - instruction read access # "I1mr" - instruction L1 read cache miss # "I2mr" - instruction L2 read cache miss events: Ir # (recommended) the total number of events covered by this profile # This is intended to allow postprocessing tools to know the total cost in advance. # "totals" appears at the end of the file and has the same value. summary: 21000073
Annotated Body from the Profile of above C Code
# "ob" means object file, i.e. the ELF executable where the next set of costs occur ob=/usr/src/parrot/parrot-svn-committable/callgrind_test # source file responsible for the next set of costs fl=/usr/src/parrot/parrot-svn-committable/callgrind_test.c # name of the function responsible for the next set of costs is "func2" fn=func2 # on line 21 of the current file, 6 cpu cycles passed 21 6 # on line 23, 6008 cpu cycles passed 23 6008 # on line 24 (before calling), 2 cpu cycles passed 24 2 # the function (in the same file) "func1" was called cfn=func1 # 'func1' was called nonrecursively from "func2" 2 times during profiling. Calling the function cost 16 cpu cycles. calls=2 16 # on line 24, the function itself cost 6018 cpu cycles 24 6018 # on line 25, 4 cpu cycles passed 25 4 # the next costs apply to the function "main" fn=main # on line 9 of the current file (callgrind_test.c), 3 cpu cycles passed 9 3 # on line 10, 1 cycle passed before... 10 1 # the function (in the same file) "func1" was called cfn=func1 # the function was called 1 time; calling the function cost 16 cycles calls=1 16 # the call happened on line 10 and cost 3009 cycles 10 3009 # on line 11, 1 cycle passed before... 11 1 # the function (in the same file) "func2" was called cfn=func2 # the call happened once; calling the function cost 21 cycles calls=1 21 # the call happened on line 11; the function cost 6019 cycles 11 6019 # on line 12, 1 cycle passed before... 12 1 # the function "func3" was called cfn=func3 # it was called once and the call cost 27 cycles calls=1 27 #on line 12 (the function call), 12039 cycles passed 12 12039 #now we're talking about "func1" fn=func1 # on line 16 of this function, 12 cpu cycles passed over the course of the process 16 12 # on line 18, 12016 cycles passed 18 12016 # on line 19, 8 cycles passed 19 8 # now we're talking about "func3" fn=func3 # on line 27, 3 cycles passed 27 3 # on line 29, 3004 cycles passed 29 3004 # on line 30, 1 cycle passed... 30 1 # "func2" was called cfn=func2 # it was called once with the call costing 21 cycles calls=1 21 # the function cost 6019 cycles 30 6019 # on line 31, 1 cycle passed... 31 1 # "func1" was called cfn=func1 # it was called once and the call cost 16 cycles calls=1 16 # the function was called on line 31 and cost 3009 cycles 31 3009 # etc 32 2 # 21073 cycles passed during the part of the process under test that was instrumented totals: 21073