c++ - Advanced issues with GPU thread divergence


My situation: I have a dynamic programming algorithm that I implemented on GPUs using OpenCL as part of my PhD studies. The GPUs I am working with include the AMD HD 7970, the HD 7750, the A10-5800K APU, and the NVIDIA GTX 680. I understand the principles involved and most of the best practices necessary for obtaining good performance.

My program contains four nested loops, and in the data-parallel formulation I was able to unfold the two outer loops. Due to the nature of the problem, the innermost loop cannot be unfolded without causing divergence. The output is a table that represents schedules of jobs on machines (computer science).

When threads diverge (work-items in a wavefront take different routes) I get wrong values, and it looks as if the work-items repeat themselves. For example,

t = 0, 1, 2, 3, 4, ... 63, 64, 65, 66, 67, ...
m1 0, 0, 0, 9, 9, ... 9, 0, 0, 0, 9, ...

The above is for a work-group size of 64. The first values up to t=63 are correct, but notice how the pattern repeats again at t=64! There shouldn't be zeros there. Here each work-item is mapped to a time t.

If I fix the parameter that causes divergence, the table gets filled with the expected (wrong) results, with no gaps (zeros), just the value 9 from t=0 to tmax, where tmax is a multiple of 64.

Question: does thread divergence have a tendency to result in wrong calculations or undefined thread behavior?

I have dug through the internet, documentation, and books for everything I can find on thread divergence and memory consistency. I have implemented the whole program in different ways, including one that calls the kernel multiple times to rule out global memory inconsistency, but the results are the same.

Any input is appreciated. Thanks!

After further investigation, and I'm ashamed to admit it here, one of the computation conditions was giving wrong values, so it looked like the work-items were acting strangely when they weren't. The problem is corrected. Thanks!

