OpenCL - How to time the NVIDIA SDK examples?
I tried to time the oclVectorAdd example. I use clGetEventProfilingInfo so that the GPU timer records the time spent on kernel execution. The time is calculated in milliseconds, but the output is weird. The code and output are below:
    cl_ulong start, end;
    cl_event event_ker_x;

    ciErr1 = clEnqueueNDRangeKernel(cqCommandQueue, ckKernel, 1, NULL, &szGlobalWorkSize, &szLocalWorkSize, 0, NULL, &event_ker_x);
    shrLog("clEnqueueNDRangeKernel (VectorAdd)...\n");
    if (ciErr1 != CL_SUCCESS)
    {
        shrLog("Error in clEnqueueNDRangeKernel, Line %u in file %s !!!\n\n", __LINE__, __FILE__);
        Cleanup(argc, argv, EXIT_FAILURE);
    }

    clGetEventProfilingInfo(event_ker_x, CL_PROFILING_COMMAND_START, sizeof(cl_ulong), &start, NULL);
    clGetEventProfilingInfo(event_ker_x, CL_PROFILING_COMMAND_END, sizeof(cl_ulong), &end, NULL);
    float ker_x_time = (end - start) * 1.0e-6f;
    shrLog("kernel execution time : %f\n", ker_x_time);

Output:

    clEnqueueNDRangeKernel (VectorAdd)...
    kernel execution time : 18446744027136.000000
    clEnqueueReadBuffer (dst)...
It looks like I have a similar problem to this person: "timed interval evaluates to zero".
In OpenCL, clEnqueueNDRangeKernel only queues the kernel to run; it doesn't execute the kernel right away. To profile kernel events, try checking the execution time after clEnqueueReadBuffer, or add clFinish(..) after clEnqueueNDRangeKernel.
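To illustrate why the printed value is so large: the profiling timestamps are unsigned 64-bit nanosecond counters, so if `end` is smaller than `start` (or neither has been filled in because the kernel has not finished), `end - start` wraps around to a value near 2^64. Scaled to milliseconds that is roughly 1.8e13, the same magnitude as the output in the question. The sketch below is plain C with no OpenCL dependency; `naive_ms` and `elapsed_ms` are hypothetical helper names, not part of the SDK. It assumes that in the real program the timestamps are read only after `clFinish(cqCommandQueue)` or `clWaitForEvents(1, &event_ker_x)` has returned, and that the command queue was created with `CL_QUEUE_PROFILING_ENABLE`.

```c
#include <stdint.h>
#include <stdio.h>

/* Naive conversion, as in the question: the unsigned subtraction wraps
   around when end_ns < start_ns, giving a value close to 2^64 ns. */
double naive_ms(uint64_t start_ns, uint64_t end_ns)
{
    return (double)(end_ns - start_ns) * 1.0e-6;
}

/* Guarded conversion (hypothetical helper): returns -1.0 when the
   timestamps are not ordered, which may mean they were read before the
   kernel completed or that the profiling query failed. */
double elapsed_ms(uint64_t start_ns, uint64_t end_ns)
{
    if (end_ns < start_ns)
        return -1.0;
    return (double)(end_ns - start_ns) * 1.0e-6;
}

/* Print either the elapsed time or a hint about what to check. */
void report_kernel_time(uint64_t start_ns, uint64_t end_ns)
{
    double ms = elapsed_ms(start_ns, end_ns);
    if (ms < 0.0)
        printf("timestamps not valid; wait for the event before querying\n");
    else
        printf("kernel execution time : %f ms\n", ms);
}
```

In the question's code, `start` and `end` would be passed to `report_kernel_time` after the wait call. It may also help to check the return value of each `clGetEventProfilingInfo` call, since it reports an error such as `CL_PROFILING_INFO_NOT_AVAILABLE` when the event has not completed or profiling is not enabled on the queue.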