How do I set the default number of threads for Scala 2.10 parallel collections?
In Scala before 2.10, I could set the parallelism in the defaultForkJoinPool (as in the answer to "Scala parallel collections degree of parallelism"). In Scala 2.10, that API no longer exists. It is documented that you can set the parallelism on a single collection (http://docs.scala-lang.org/overviews/parallel-collections/configuration.html) by assigning to its tasksupport property.
However, I use parallel collections all over my codebase and would rather not add two extra lines to every single collection instantiation. Is there a way to configure the global default thread pool size, so that someCollection.par.map(f(_)) automatically uses the default number of threads?
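For reference, the per-collection configuration from the linked documentation looks roughly like the sketch below. It assumes Scala 2.10 (where scala.concurrent.forkjoin.ForkJoinPool is available); the object name, the method name and the thread count of 4 are made up for illustration.

```scala
import scala.collection.parallel.ForkJoinTaskSupport
import scala.concurrent.forkjoin.ForkJoinPool

// Hypothetical example object; only tasksupport and ForkJoinTaskSupport
// come from the documented API.
object PerCollectionParallelism {
  def doubledSum(numThreads: Int): Int = {
    val numbers = (1 to 1000).toVector.par
    // The extra line(s) needed for each collection: attach a task support
    // backed by a pool with the desired parallelism.
    numbers.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool(numThreads))
    numbers.map(_ * 2).sum
  }

  def main(args: Array[String]): Unit =
    println(doubledSum(4))
}
```

These are the lines that would have to be repeated at every place a parallel collection is created.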
I know the question is over a month old, but I've just had exactly the same question. Googling wasn't helpful, and I couldn't find anything that looked halfway sane in the new API.
Setting -Dscala.concurrent.context.maxThreads=n, as suggested here: "Set parallelism level for all collections in Scala 2.10?", seemingly had no effect at all, but I'm not sure whether I used it correctly (I run my application with 'java' in an environment without 'scala' installed explicitly, which might be the cause).
I don't know why the Scala people removed this essential setter from the appropriate package object.
However, it's often possible to use reflection to work around an incomplete/weird interface:
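    def setParallelismGlobally(numThreads: Int): Unit = {
      // The package object that holds the defaultTaskSupport value
      val parPkgObj = scala.collection.parallel.`package`
      // Look the private field up by name among the declared fields
      val defaultTaskSupportField = parPkgObj.getClass.getDeclaredFields.find {
        _.getName == "defaultTaskSupport"
      }.get
      // Make the private final field writable via reflection
      defaultTaskSupportField.setAccessible(true)
      // Replace the default with a task support backed by our own pool
      defaultTaskSupportField.set(
        parPkgObj,
        new scala.collection.parallel.ForkJoinTaskSupport(
          new scala.concurrent.forkjoin.ForkJoinPool(numThreads)
        )
      )
    }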
For those not familiar with the more obscure features of Scala, here is a short explanation:
    scala.collection.parallel.`package`
accesses the package object with the defaultTaskSupport variable (it looks like Java's static variable, but it's actually a member variable of the package object). The backticks are required for the identifier, because package is a reserved keyword. Then we get the private final field that we want (getField("defaultTaskSupport") didn't work for some reason?... presumably because getField only returns public fields), tell it to be accessible in order to be able to modify it, and then replace its value with our own ForkJoinTaskSupport.
I don't yet understand the exact mechanism of the creation of parallel collections, but the source code of the Combiner trait suggests that the value of defaultTaskSupport should percolate to the parallel collections somehow.
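One way to check whether the value really percolates is to compare the parallelism level of the global default with that of a freshly created parallel collection, before and after overriding the default. This is only a minimal sketch, assuming Scala 2.10; the object and method names are made up for illustration.

```scala
import scala.collection.parallel.defaultTaskSupport

// Hypothetical helper object for inspecting the current settings.
object ParallelismCheck {
  // Parallelism level of the global default task support
  def globalLevel: Int = defaultTaskSupport.parallelismLevel

  // Parallelism level picked up by a newly created parallel collection
  def freshCollectionLevel: Int = (1 to 10).par.tasksupport.parallelismLevel

  def main(args: Array[String]): Unit = {
    println("global default:   " + globalLevel)
    println("fresh collection: " + freshCollectionLevel)
  }
}
```

If the two numbers agree and match the requested thread count after the default has been replaced, newly created collections are picking up the new default. Collections created earlier keep whatever task support they were given when they were built.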
Notice that the question is qualitatively of the same sort as a much older question: "I have Math.random() all over my codebase, how can I set the seed to a fixed number for debugging purposes?" (see e.g.: "Set seed on Math.random()"). In both cases, we have some sort of global "static" variable that we implicitly use in a million different places, we want to change it, but there are no setters for this variable => we use reflection.
Ugly as hell, but it seems to work just fine. If you need to limit the total number of threads, don't forget that the garbage collector runs on a separate thread.