Deniz Yuret
2015-07-31 01:38:24 UTC
Here is a parallel program:
M = [rand(1000,1000) for i=1:16]
@time pmap(svd, M)
Here are timing results for local workers on machine1, a 16-core machine:
julia -p 2: 14.98 secs
julia -p 4: 16.02 secs
julia -p 8: 17.64 secs
Here are timing results for machine1 connecting to remote workers on
machine2, which is the same type of machine:
julia --machinefile <2 copies of machine2>: 11.75 secs
julia --machinefile <4 copies of machine2>: 7.54 secs
julia --machinefile <8 copies of machine2>: 6.46 secs
At first I thought things got messed up if the master and the slaves were
on the same machine. But it turns out the difference is between -p <n> and
--machinefile. If I rerun the same test on a single machine, but use
--machinefile instead of -p <n>:
julia --machinefile <2 copies of machine1>: 8.41 secs
julia --machinefile <4 copies of machine1>: 4.70 secs
julia --machinefile <8 copies of machine1>: 3.31 secs
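To make "<n copies of machine1>" concrete, here is a minimal sketch of how such a machinefile could be put together. The file name machines.txt and the hostname machine1 are placeholders, not the real names, and the sketch assumes the usual setup where --machinefile starts each worker over passwordless ssh, one worker per line of the file.

```shell
# Build a machinefile that lists the same host n times (one worker per line).
# "machine1" and "machines.txt" are placeholders, not the real names.
n=4
host=machine1
: > machines.txt                 # start from an empty file
for i in $(seq "$n"); do
    echo "$host" >> machines.txt
done
cat machines.txt
# Then start Julia with:  julia --machinefile machines.txt
```

Changing n to 2 or 8 would give the other two configurations in the list above.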
I am using Julia Version 0.3.9 (2015-05-30 11:24 UTC).
Why is -p <n> so much slower than --machinefile, and why does it get slower
as workers are added?
thanks,
deniz