OpenCL - How to time the NVIDIA SDK examples?
I tried to time the oclVectorAdd example. I use clGetEventProfilingInfo so that the GPU timer records the time spent on kernel execution. The time is calculated in milliseconds, but the output is weird. The code and output are below:
    cl_ulong start, end;
    cl_event event_ker_x;

    ciErr1 = clEnqueueNDRangeKernel(cqCommandQueue, ckKernel, 1, NULL, &szGlobalWorkSize, &szLocalWorkSize, 0, NULL, &event_ker_x);
    shrLog("clEnqueueNDRangeKernel (VectorAdd)...\n");
    if (ciErr1 != CL_SUCCESS)
    {
        shrLog("Error in clEnqueueNDRangeKernel, Line %u in file %s !!!\n\n", __LINE__, __FILE__);
        Cleanup(argc, argv, EXIT_FAILURE);
    }

    clGetEventProfilingInfo(event_ker_x, CL_PROFILING_COMMAND_START, sizeof(cl_ulong), &start, NULL);
    clGetEventProfilingInfo(event_ker_x, CL_PROFILING_COMMAND_END, sizeof(cl_ulong), &end, NULL);
    float ker_x_time = (end - start) * 1.0e-6f;
    shrLog("kernel execution time : %f\n", ker_x_time);

    clEnqueueNDRangeKernel (VectorAdd)...
    kernel execution time : 18446744027136.000000
    clEnqueueReadBuffer (Dst)...
It looks like a similar problem to this person's: the timed interval evaluates to zero.
In OpenCL, clEnqueueNDRangeKernel only queues the kernel to run; it doesn't execute the kernel right away. To profile kernel events, try checking the execution time after clEnqueueReadBuffer, or add clFinish(..) after clEnqueueNDRangeKernel.
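The printed value in the question is consistent with an unsigned wraparound: if the end timestamp is smaller than the start timestamp (for example because the event had not completed when the timestamps were read), end - start becomes a number just below 2^64, and scaling it from nanoseconds to milliseconds gives roughly 1.8e13. The following is a minimal stand-alone sketch of that arithmetic, with no OpenCL calls. The names naive_ms and elapsed_ms are hypothetical helpers, not part of the NVIDIA SDK, and uint64_t stands in for cl_ulong.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same arithmetic as in the question: unsigned 64-bit subtraction, then a
   nanosecond-to-millisecond scale in single precision. If end_ns < start_ns,
   the subtraction wraps around to a value just below 2^64. */
static float naive_ms(uint64_t start_ns, uint64_t end_ns)
{
    return (end_ns - start_ns) * 1.0e-6f;
}

/* Hypothetical safer helper: rejects a pair where end precedes start.
   Returns 0 and stores the elapsed milliseconds on success, -1 otherwise. */
static int elapsed_ms(uint64_t start_ns, uint64_t end_ns, double *out_ms)
{
    assert(out_ms != NULL);
    if (end_ns < start_ns)
        return -1;
    *out_ms = (double)(end_ns - start_ns) * 1.0e-6;
    return 0;
}
```

In the question's code, the two arguments would be the start and end values read after clFinish(cqCommandQueue) has returned. It is also worth checking the return value of each clGetEventProfilingInfo call, since it reports CL_PROFILING_INFO_NOT_AVAILABLE when the command has not completed or when the queue was created without CL_QUEUE_PROFILING_ENABLE.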