Time difference for random number generation implementation in Java vs. C++


I'm writing a Monte Carlo simulation in Java that involves generating a lot of random integers. Thinking that native code would be faster for random number generation, I figured I should write the code in C++ and return the output via JNI. But when I wrote the same method in C++, it actually takes longer to execute than the Java version. Here are the code samples:

    Random rand = new Random();
    int threshold = 5;
    int[] composition = {10, 10, 10, 10, 10};
    for (int j = 0; j < 100000000; j++) {
        rand.setSeed(System.nanoTime());
        double sum = 0;
        for (int i = 0; i < composition[0]; i++) sum += carbon(rand);
        for (int i = 0; i < composition[1]; i++) sum += hydrogen(rand);
        for (int i = 0; i < composition[2]; i++) sum += nitrogen(rand);
        for (int i = 0; i < composition[3]; i++) sum += oxygen(rand);
        for (int i = 0; i < composition[4]; i++) sum += sulfur(rand);
        if (sum < threshold) {} // execute code
        else {} // execute other code
    }

And the equivalent code in C++:

    int threshold = 5;
    int composition[5] = {10, 10, 10, 10, 10};
    for (int j = 0; j < 100000000; j++) {
        srand(time(0));
        double sum = 0;
        for (int i = 0; i < composition[0]; i++) sum += carbon();
        for (int i = 0; i < composition[1]; i++) sum += hydrogen();
        for (int i = 0; i < composition[2]; i++) sum += nitrogen();
        for (int i = 0; i < composition[3]; i++) sum += oxygen();
        for (int i = 0; i < composition[4]; i++) sum += sulfur();
        if (sum > threshold) {}
        else {}
    }

All of the element methods (carbon, hydrogen, etc.) generate a random number and return a double.

The runtimes are 77.471 sec for the Java code and 121.777 sec for the C++ code.

Admittedly I'm not very experienced in C++, so it's possible the cause is simply badly written code.

I suspect the performance issue is in the bodies of your carbon(), hydrogen(), nitrogen(), oxygen(), and sulfur() functions. You should show how they produce the random data.

Or it's in the if (sum < threshold) {} else {} code.

I wanted to keep setting the seed so the results would not be deterministic (closer to being truly random).

Since you're using the result of time(0) as the seed, you're not getting particularly random results either way.
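To see why, note that time(0) has one-second resolution, so every loop iteration that runs within the same second passes the same seed to srand() and restarts the same sequence. A minimal sketch of that effect follows; the helper name first_value_after_seed is made up for illustration and the seed 42 is arbitrary:

```cpp
#include <cstdlib>

// Hypothetical helper: reseed the C generator with `seed`, then return the
// first value rand() produces afterwards.
int first_value_after_seed(unsigned seed) {
    std::srand(seed);
    return std::rand();
}
```

Calling first_value_after_seed(42) twice returns the same number both times, which is what happens in the question's loop whenever two iterations see the same time(0) value.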

Instead of using srand() and rand() you should take a look at the <random> library and choose an engine with the performance/quality characteristics that meet your needs. If your implementation supports it you can even get non-deterministic random data from std::random_device (either to generate seeds or as an engine).
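As a rough sketch of that suggestion, the engine can be seeded once from std::random_device and then reused for every draw. The choice of std::mt19937, the helper names make_engine and draw, and the range 0 to 9999 are assumptions for illustration, not something the question specifies:

```cpp
#include <random>

// Hypothetical helper: build one engine, seeded a single time.
// Note: std::random_device may be deterministic on some implementations.
std::mt19937 make_engine() {
    std::random_device rd;
    return std::mt19937{rd()};
}

// Draw an integer in [0, 9999] from an engine that is seeded once and reused.
int draw(std::mt19937& eng) {
    std::uniform_int_distribution<int> dist(0, 9999);
    return dist(eng);
}
```

The engine would be created once before the outer loop and passed by reference into each call, so no reseeding happens inside the loop.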

Additionally, <random> provides pre-made distributions such as std::uniform_real_distribution<double>, which are likely to be better than the average programmer's method of manually computing the distribution they want from the results of rand().
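For example, an event that should occur with a given probability can be drawn from a pre-made distribution instead of from rand() and the % operator. This is only a sketch: the helper name occurs and the use of std::mt19937 are assumptions.

```cpp
#include <random>

// Hypothetical helper: returns true with probability p.
bool occurs(std::mt19937& eng, double p) {
    std::uniform_real_distribution<double> unit(0.0, 1.0);  // values in [0, 1)
    return unit(eng) < p;
}
```

std::bernoulli_distribution expresses the same idea directly and may be the more natural choice when only a yes/no outcome is needed.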


Okay, here's how you can eliminate the inner loops from your code and drastically speed it up (in Java or C++).

Your code:

    double carbon() {
        if (rand() % 10000 < 107)
            return 13.0033548378;
        else
            return 12.0;
    }

picks one of two values with a particular probability. Presumably you intended the first value to be picked about 107 times out of 10000 (although using % with rand() doesn't quite give you that). When you run this in a loop and sum the results, as in:

    for (int i = 0; i < composition[0]; i++) sum += carbon();

you'll essentially get sum += X*13.0033548378 + Y*12.0; where X is the number of times the random number stays under the threshold and Y is (trials-X). It just so happens that you can simulate running a bunch of trials and counting the number of successes using a binomial distribution, and <random> happens to provide a binomial distribution.
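As a sanity check of that equivalence, the two ways of counting successes can be put side by side. This is a sketch under assumptions: the helper names successes_by_loop and successes_by_binomial are made up, and std::mt19937 is used only for illustration.

```cpp
#include <random>

// Hypothetical helper: count successes by running the trials one at a time.
int successes_by_loop(std::mt19937& eng, int trials, double p) {
    std::bernoulli_distribution hit(p);
    int count = 0;
    for (int i = 0; i < trials; ++i) {
        if (hit(eng)) ++count;
    }
    return count;
}

// Hypothetical helper: draw the same kind of count in a single step.
int successes_by_binomial(std::mt19937& eng, int trials, double p) {
    std::binomial_distribution<int> dist(trials, p);
    return dist(eng);
}
```

Both functions return a value between 0 and trials, and over many runs both averages should approach trials * p (0.107 for 10 trials at probability 107/10000).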

Given a function sum_trials():

    #include <random>

    std::minstd_rand0 eng; // global random engine

    double sum_trials(int trials, double probability, double a, double b) {
        std::binomial_distribution<> dist(trials, probability);
        int successes = dist(eng);
        return successes*a + (trials-successes)*b;
    }

you can replace your carbon() loop with:

    sum += sum_trials(composition[0], 107.0/10000.0, 13.0033548378, 12.0); // carbon trials

I don't have the actual values you're using for the other elements, but your whole loop would look something like:

    for (int i = 0; i < 100000000; i++) {
        double sum = 0;
        // the carbon values stand in for the other elements here
        sum += sum_trials(composition[0], 107.0/10000.0, 13.0033548378, 12.0); // carbon trials
        sum += sum_trials(composition[1], 107.0/10000.0, 13.0033548378, 12.0); // hydrogen trials
        sum += sum_trials(composition[2], 107.0/10000.0, 13.0033548378, 12.0); // nitrogen trials
        sum += sum_trials(composition[3], 107.0/10000.0, 13.0033548378, 12.0); // oxygen trials
        sum += sum_trials(composition[4], 107.0/10000.0, 13.0033548378, 12.0); // sulfur trials

        if (sum > threshold) {
        } else {
        }
    }

Now, one thing to note is that inside the function we're constructing distributions over and over with the same data. We can extract that by replacing the function sum_trials() with a function object, which we construct with the appropriate data once before the loop, and then just use the functor repeatedly:

    struct sum_trials {
        std::binomial_distribution<> dist;
        double a; double b; int trials;

        sum_trials(int t, double p, double a, double b) : dist{t, p}, a{a}, b{b}, trials{t} {}

        double operator() () {
            int successes = dist(eng); // eng is the global engine declared earlier
            return successes * a + (trials - successes) * b;
        }
    };

    int main() {
        int threshold = 5;
        int composition[5] = { 10, 10, 10, 10, 10 };

        // the carbon values stand in for the other elements here
        sum_trials carbon   = { composition[0], 107.0/10000.0, 13.0033548378, 12.0};
        sum_trials hydrogen = { composition[1], 107.0/10000.0, 13.0033548378, 12.0};
        sum_trials nitrogen = { composition[2], 107.0/10000.0, 13.0033548378, 12.0};
        sum_trials oxygen   = { composition[3], 107.0/10000.0, 13.0033548378, 12.0};
        sum_trials sulfur   = { composition[4], 107.0/10000.0, 13.0033548378, 12.0};

        for (int i = 0; i < 100000000; i++) {
            double sum = 0;

            sum += carbon();
            sum += hydrogen();
            sum += nitrogen();
            sum += oxygen();
            sum += sulfur();

            if (sum > threshold) {
            } else {
            }
        }
    }

The original version of the code took my system about one minute 30 seconds. The last version here takes 11 seconds.


Here's a functor to generate the oxygen sums using two binomial_distributions. Maybe one of the other distributions can do this in one shot, but I don't know.

    struct sum_trials2 {
        std::binomial_distribution<> d1;
        std::binomial_distribution<> d2;
        double a; double b; double c;
        int trials;
        double probability2;

        sum_trials2(int t, double p1, double p2, double a, double b, double c)
            : d1{t, p1}, a{a}, b{b}, c{c}, trials{t}, probability2{p2} {}

        double operator() () {
            int x = d1(eng);
            // the second stage only runs over the trials the first stage left over
            d2.param(std::binomial_distribution<>{trials-x, probability2}.param());
            int y = d2(eng);

            return x*a + y*b + (trials-x-y)*c;
        }
    };

    sum_trials2 oxygen{composition[3], 17.0/1000.0, (47.0-17.0)/(1000.0-17.0), 17.9999, 16.999, 15.999};

You can further speed this up if you can calculate the probability of the sum being over the threshold:

    int main() {
        std::minstd_rand0 eng;
        // probability_sum_is_over_threshold is a value you would compute beforehand
        std::bernoulli_distribution dist(probability_sum_is_over_threshold);

        for (int i = 0; i < 100000000; ++i) {
            if (dist(eng)) {
            } else {
            }
        }
    }

Unless the values for the other elements can be negative, the probability of the sum being greater than 5 is 100%. In that case you don't even need to generate any random data; just execute the 'if' branch of your code 100,000,000 times.

    int main() {
        for (int i = 0; i < 100000000; ++i) {
            // execute code
        }
    }
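The 100% figure can be checked with the numbers already given: composition[0] is 10 and the lightest value carbon() returns is 12.0, so carbon alone contributes at least 120 to the sum. A small sketch of that lower bound follows; the helper name min_sum is made up for illustration:

```cpp
// Hypothetical helper: the smallest total that `count` atoms can contribute
// when every one of them takes its lightest mass.
double min_sum(int count, double lightest_mass) {
    return count * lightest_mass;
}
```

min_sum(10, 12.0) gives 120, which already exceeds the threshold of 5 before any other element is added, assuming those elements contribute non-negative values.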
