c++ - Benchmarking code - am I doing it right?


I want to benchmark C/C++ code. I want to measure CPU time, wall time, and cycles/byte. I wrote some measurement functions, but I have a problem with cycles/byte.

To get CPU time I wrote a function that uses getrusage() with RUSAGE_SELF, for wall time I use clock_gettime with CLOCK_MONOTONIC, and for cycles/byte I use rdtsc.

I process an input buffer of some size, for example 1024: char buffer[1024]. This is how I benchmark:
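The question does not show its walltime(), cputime() and cyclecount() helpers, so the following is only a minimal sketch of what such helpers could look like on Linux. The names wall_seconds, cpu_seconds and cycle_count are hypothetical, and the non-x86 fallback is an assumption that returns nanoseconds rather than real cycles.

```cpp
#include <cassert>
#include <cstdint>
#include <time.h>          // clock_gettime, CLOCK_MONOTONIC
#include <sys/resource.h>  // getrusage, RUSAGE_SELF
#include <sys/time.h>      // struct timeval
#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>     // __rdtsc
#endif

// Wall-clock time in seconds, read from a monotonic clock.
// (Hypothetical helper name, not the question's walltime().)
double wall_seconds() {
    struct timespec ts{};
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return static_cast<double>(ts.tv_sec) + static_cast<double>(ts.tv_nsec) * 1e-9;
}

// User + system CPU time consumed by this process so far, in seconds.
double cpu_seconds() {
    struct rusage ru{};
    int rc = getrusage(RUSAGE_SELF, &ru);
    assert(rc == 0);
    (void)rc;
    return static_cast<double>(ru.ru_utime.tv_sec + ru.ru_stime.tv_sec) +
           static_cast<double>(ru.ru_utime.tv_usec + ru.ru_stime.tv_usec) * 1e-6;
}

// Time-stamp counter on x86. On other targets this falls back to
// monotonic nanoseconds, which are NOT cycles (assumption for portability).
std::uint64_t cycle_count() {
#if defined(__x86_64__) || defined(__i386__)
    return __rdtsc();
#else
    struct timespec ts{};
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return static_cast<std::uint64_t>(ts.tv_sec) * 1000000000ull +
           static_cast<std::uint64_t>(ts.tv_nsec);
#endif
}
```

Elapsed values come from subtracting two readings, for example `double t0 = wall_seconds(); /* work */ double dt = wall_seconds() - t0;`.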

  1. Do a warm-up phase, calling fun2measure(args) 1000 times:

        for (int i = 0; i < 1000; i++) fun2measure(args);

  2. Then the real timing benchmark, for wall time:

        unsigned long i;
        double timetaken;
        double timetotal = 3.0; // process for 3 seconds

        for (timetaken = 0.0, i = 0; timetaken <= timetotal; timetaken = walltime(1), i++)
            fun2measure(args);

  3. And for CPU time (almost the same, but using cputime):

        for (timetaken = 0.0, i = 0; timetaken <= timetotal; timetaken = cputime(1), i++)
            fun2measure(args);

But when I want the CPU cycle count for the function, I use this piece of code:

    // wall-time variant
    unsigned long s = cyclecount();
    for (timetaken = 0.0, i = 0; timetaken <= timetotal; timetaken = walltime(1), i++)
    {
        fun2measure(args);
    }
    unsigned long e = cyclecount();

    // CPU-time variant
    unsigned long s = cyclecount();
    for (timetaken = 0.0, i = 0; timetaken <= timetotal; timetaken = cputime(1), i++)
    {
        fun2measure(args);
    }
    unsigned long e = cyclecount();

And then I compute cycles/byte as (e - s) / (i * inputssize). Here inputssize is 1024 because that is the length of the buffer. When I raise timetotal to 10 s, I get strange results:

For 10 s:

    did fun2measure 1148531 times in 10.00 seconds 1024 bytes, 0 cycles/byte [cpu]
    did fun2measure 1000221 times in 10.00 seconds 1024 bytes, 3.000000 cycles/byte [wall]

For 5 s:

    did fun2measure 578476 times in 5.00 seconds 1024 bytes, 0 cycles/byte [cpu]
    did fun2measure 499542 times in 5.00 seconds 1024 bytes, 7.000000 cycles/byte [wall]

For 4 s:

    did fun2measure 456828 times in 4.00 seconds 1024 bytes, 4 cycles/byte [cpu]
    did fun2measure 396612 times in 4.00 seconds 1024 bytes, 3.000000 cycles/byte [wall]

My questions:

  1. Are these results OK?
  2. Why do I get 0 cycles/byte for CPU time when I increase the duration?
  3. How can I measure average time, mean, standard deviation and similar statistics for such benchmarking?
  4. Is my benchmarking method 100% OK?

Cheers!

1st edit:

After changing i to double:

    did fun2measure 1138164.00 times in 10.00 seconds 1024 bytes, 0.410739 cycles/byte [cpu]
    did fun2measure 999849.00 times in 10.00 seconds 1024 bytes, 3.382036 cycles/byte [wall]

My results seem OK now, so question #2 isn't a question anymore :)
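The change reported in the edit is consistent with integer truncation: when every operand in (e - s) / (i * inputssize) is an unsigned integer, the quotient loses its fractional part and anything below 1 becomes 0. The sketch below contrasts the two forms. The function names are hypothetical and not taken from the question's code.

```cpp
#include <cassert>
#include <cstdint>

// All-integer arithmetic: the fractional part of the quotient is discarded,
// so any result below 1 cycle/byte is reported as 0.
std::uint64_t cycles_per_byte_truncated(std::uint64_t start, std::uint64_t end,
                                        std::uint64_t iterations,
                                        std::uint64_t bytes_per_call) {
    assert(iterations > 0 && bytes_per_call > 0);
    return (end - start) / (iterations * bytes_per_call);
}

// Same formula, but converted to double before dividing,
// which keeps the fractional part.
double cycles_per_byte(std::uint64_t start, std::uint64_t end,
                       std::uint64_t iterations,
                       std::uint64_t bytes_per_call) {
    assert(iterations > 0 && bytes_per_call > 0);
    return static_cast<double>(end - start) /
           (static_cast<double>(iterations) * static_cast<double>(bytes_per_call));
}
```

For instance, 512 cycles spent on one call over 1024 bytes yields 0 with the integer form and 0.5 with the floating-point form.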

Your cyclecount benchmark is flawed, as it includes the cost of the walltime/cputime function calls. In general though, I urge you to use a proper profiler instead of trying to reinvent the wheel. Performance counters will give you numbers you can rely on. Also note that cycles are unreliable, since the CPU may not be running at a fixed frequency, or the kernel may task-switch and halt your app for some time.
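One way to keep the timer calls out of the measured region is to loop a fixed number of times and read the counter only before and after the loop. This is a rough sketch under that assumption. The names read_ticks and cycles_per_byte_fixed_n are hypothetical, and on non-x86 targets the fallback counts nanoseconds rather than cycles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>  // __rdtsc
#else
#include <chrono>
#endif

// Reads the time-stamp counter on x86; elsewhere returns steady-clock
// nanoseconds as a stand-in (assumption, not real cycles).
inline std::uint64_t read_ticks() {
#if defined(__x86_64__) || defined(__i386__)
    return __rdtsc();
#else
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count());
#endif
}

// Calls fn() exactly `iterations` times with no timer calls inside the loop,
// then returns ticks per processed byte.
template <typename Fn>
double cycles_per_byte_fixed_n(Fn&& fn, std::uint64_t iterations,
                               std::size_t bytes_per_call) {
    assert(iterations > 0 && bytes_per_call > 0);
    const std::uint64_t start = read_ticks();
    for (std::uint64_t i = 0; i < iterations; ++i) {
        fn();
    }
    const std::uint64_t end = read_ticks();
    return static_cast<double>(end - start) /
           (static_cast<double>(iterations) * static_cast<double>(bytes_per_call));
}
```

A call might look like `cycles_per_byte_fixed_n([&] { fun2measure(args); }, 100000, 1024)`. The result is still subject to frequency scaling and task switches, and the compiler may optimize away work whose result is never used.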

I write benchmarks such that they run the given function N times, with N being large enough that you get enough samples. I then externally apply a profiler such as Linux perf to get hard numbers to reason about. By repeating the benchmark several times you can calculate stddev/avg values, which you can do in a script that runs the benchmark a few times and evaluates the output of the profiler.
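The avg/stddev step could also be done in C++ once each repeated run has produced one number, for example seconds or cycles/byte. The following is a small sketch, with the Summary type and summarize function being hypothetical names. It uses the sample standard deviation (dividing by n - 1), which is one common choice and an assumption here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Summary {
    double mean;
    double stddev;  // sample standard deviation (n - 1 in the denominator)
};

// Summarizes one measurement per benchmark run.
Summary summarize(const std::vector<double>& samples) {
    assert(!samples.empty());
    const double n = static_cast<double>(samples.size());

    double sum = 0.0;
    for (double x : samples) sum += x;
    const double mean = sum / n;

    // Sum of squared deviations from the mean.
    double sq = 0.0;
    for (double x : samples) sq += (x - mean) * (x - mean);

    // With a single sample there is no spread to estimate; report 0.
    const double stddev = samples.size() > 1 ? std::sqrt(sq / (n - 1.0)) : 0.0;
    return Summary{mean, stddev};
}
```

A large standard deviation relative to the mean suggests the runs were disturbed, for example by frequency changes or other processes, and that more runs or a quieter machine are needed.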

