Profiling
Improving a program by reducing its run time is called "tuning". Tuning requires measuring the performance of the application. Analyzing the performance of an application is called "profiling", and a tool for profiling is called a "profiler". Finding the most time-consuming parts and speeding them up yields the most effective tuning.
Timing
There are several easy ways to measure the execution time of an application, listed below. Here, we consider only serial programs. For parallel programs, you must be careful about whether you are measuring the elapsed time or the sum of the times spent by all cores.
- Using the time command
- Embedding the time measurement functions in your program
- Using profilers
Using the time command
In a UNIX/Linux environment, you can use the time command to measure execution time. It measures the execution time of the program given as its argument.
time ./a.out
real ??m??s
user ??m??s
sys ??m??s
"real" is the elapsed time between the launch and the end of the program, "user" is the user CPU time, and "sys" is the system CPU time. The time spent executing the program's own code corresponds to "user". Note that there are different versions of the time command, and the output format may differ between them.
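The "user" and "sys" figures can also be read from inside a program. The following is a minimal sketch, assuming a POSIX system that provides getrusage(); the function names timeval_to_seconds and cpu_times are hypothetical helpers introduced only for this illustration.

```c
#include <sys/resource.h>  /* getrusage, struct rusage (POSIX, not ANSI C) */
#include <sys/time.h>      /* struct timeval */

/* Hypothetical helper: converts a struct timeval to seconds. */
double timeval_to_seconds(struct timeval tv) {
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Hypothetical helper: stores the user and system CPU time consumed so far
   by the calling process. Returns 0 on success, -1 if getrusage() fails. */
int cpu_times(double *user_seconds, double *sys_seconds) {
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) != 0) {
        return -1;
    }
    *user_seconds = timeval_to_seconds(usage.ru_utime);  /* "user" */
    *sys_seconds = timeval_to_seconds(usage.ru_stime);   /* "sys" */
    return 0;
}
```

Calling cpu_times before and after a piece of work and subtracting the results gives the user and system CPU time spent on that work.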
Time measurement in C
The standard library defined by the ANSI C standard provides functions related to date and time. To use these functions, include the header file "time.h" in your C program. The following function, constant, and type are defined in time.h.
clock_t clock(void)
- returns the processor time that the program has spent since its launch. (It returns (clock_t)-1 if the time is not available.)
CLOCKS_PER_SEC
- defines the unit of the return value of clock. clock()/CLOCKS_PER_SEC gives the processor time in seconds (cast to double before dividing to keep the fractional part).
clock_t
- defines the type of the return value of clock. It is normally long.
The following is a sample program. To measure the execution time of some work, call clock before and after the work. The difference between the two return values, divided by CLOCKS_PER_SEC, gives the processor time spent on the work in seconds.
#include <stdio.h>
#include <time.h>
int main(void) {
    clock_t time_start, time_end;
    time_start = clock();
    /* some work */
    time_end = clock();
    /* convert clock ticks to seconds in floating point */
    printf("processor time in seconds = %f\n", (double)(time_end - time_start) / CLOCKS_PER_SEC);
    return 0;
}
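The same pattern can be wrapped in a function so that the measurement is reusable. The following is a minimal sketch; busy_work and cpu_seconds_for_work are hypothetical names, and busy_work is only a stand-in for "some work". The subtraction is done in clock ticks and converted to double afterwards, so fractions of a second are not lost to integer division.

```c
#include <time.h>

/* Hypothetical stand-in for "some work": sums the integers 1..n. */
double busy_work(long n) {
    volatile double sum = 0.0;  /* volatile discourages the compiler from removing the loop */
    for (long i = 1; i <= n; i++) {
        sum += (double)i;
    }
    return sum;
}

/* Returns the processor time, in seconds, consumed by busy_work(n),
   or -1.0 if clock() reports that the processor time is not available. */
double cpu_seconds_for_work(long n) {
    clock_t start = clock();
    if (start == (clock_t)-1) {
        return -1.0;
    }
    busy_work(n);
    clock_t end = clock();
    if (end == (clock_t)-1) {
        return -1.0;
    }
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Because the resolution of clock is limited, very short pieces of work may report 0 seconds; repeating the work many times and dividing by the repetition count is a common workaround.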
Time measurement in Fortran
You can use the subroutine cpu_time, which is standardized in Fortran 95.
subroutine cpu_time(time)
real, intent(out) :: time
cpu_time returns the processor time (in seconds) in its argument. The following is a sample program equivalent to the C sample.
program time_some_code
    real :: time_start, time_end
    call cpu_time(time_start)
    ! some work
    call cpu_time(time_end)
    print *, "processor time in seconds = ", time_end - time_start
end program time_some_code
Time measurement in MATLAB
The function cputime returns the total CPU time (in seconds) used since the application was launched. (Note that the timeit function or tic/toc is recommended for performance measurement.)
t = cputime;
% some work
e = cputime - t
Profiler: GNU gprof
GNU gprof is a simple profiling tool included in GNU binutils and is available in Linux environments. For details on its usage, consult the manual. Basic usage is as follows: compile the program with the -pg option and run it. The profiling information is then recorded in gmon.out when the program finishes. To analyze the program, run the gprof command with two arguments: the program name and the profiling information file (gmon.out).
gcc -pg -o a.out sample.c
./a.out (gmon.out is created)
gprof ./a.out gmon.out
Flat profile:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 98.59      1.40     1.40        1     1.40     1.40  gauss_elim
  1.41      1.42     0.02                             randf
  0.00      1.42     0.00        1     0.00     0.00  init_matrix
  0.00      1.42     0.00        1     0.00     0.00  init_problem_random
  0.00      1.42     0.00        1     0.00     0.00  init_vec
<snip>
For details on how to read the output, consult the explanation printed with the output or the manual.
Profiler: Instruments
Instruments is included in Xcode, the integrated development environment for macOS. Launch Instruments from the Xcode menu via Xcode→Open Developer Tool→Instruments, and select Time Profiler from the template selection. Consult Help for usage. (Because it is a GUI tool, it may be self-explanatory.)