thr
2015-07-31 00:51:24 UTC
Hi all,
I'm implementing a basic explicit advection algorithm of the form:
for t = 1:T-1
    for j = 3:n-2
        for i = 3:m-2
            q[i,j,t+1] = timestep(q[i,j,t], u[i,j,t])
        end
    end
end
where q is a quantity and u a velocity field.
I'd like to parallelize this using shared arrays and @parallel for. I tried
the following:
const n = 500
const m = 500
const T = 500
@everywhere function timestep(x,y)
    # return x+y
    return x+y +x+y +x+y +x+y +x+y +x+y +x+y  # repeated terms just to add work per call
end
function advection_ser(q, u)
    println("==============serial=================$n x $m x $T")
    for t = 1:T-1
        for j = 3:n-2
            for i = 3:m-2
                q[i,j,t+1] = timestep(q[i,j,t], u[i,j,t])
            end
        end
    end
    return q
end
function advection_par(q, u)
    println("==============parallel=================$n x $m x $T")
    for t = 1:T-1
        @sync @parallel for j = 3:n-2
            for i = 3:m-2
                q[i,j,t+1] = timestep(q[i,j,t], u[i,j,t])
            end
        end
    end
    return q
end
q = SharedArray(Float64, (m,n,T), init=false)
u = SharedArray(Float64, (m,n,T), init=false)
@time qs = advection_ser(q,u)
@time qp = advection_par(q,u)
But this yields only a very modest speedup: the parallel version is about 1/3
faster than the serial one for m, n, T = 500, 500, 500 and -p 4. Is there a
way I can improve on this?
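For reference, I run this with 4 workers like so (assuming the script is saved
as advection.jl; the filename is just for illustration):

julia -p 4 advection.jl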
I have also seen some weird behaviour regarding shared arrays and I'd like
to verify that I'm not just doing it wrong before opening issues:
1. When I construct q inside the advection function, @code_warntype reports it
as Any and the code is much slower. However, typeof(q) says it is of type
SharedArray{Float64,3}, as it should be (see the first sketch after this list).
2. I'm pretty sure there's a memory leak associated with SharedArrays: when I
run the above program over and over, I eventually get a bus error and Julia
crashes (see the second sketch below). Do I have to somehow release the shared
memory from the workers?
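To make point 1 concrete, here is roughly what I mean (simplified;
advection_inner is just an illustrative name):

function advection_inner()
    # constructed inside the function instead of being passed in
    q = SharedArray(Float64, (m, n, T), init=false)
    u = SharedArray(Float64, (m, n, T), init=false)
    # @code_warntype on this function flags q and u as Any,
    # although typeof(q) at runtime is SharedArray{Float64,3}
    for t = 1:T-1
        for j = 3:n-2
            for i = 3:m-2
                q[i,j,t+1] = timestep(q[i,j,t], u[i,j,t])
            end
        end
    end
    return q
end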
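And for point 2, a guess at a minimal reproduction (in reality I restart the
whole script repeatedly rather than looping, and the iteration count here is
arbitrary):

for k = 1:100
    # allocate a SharedArray and let it go out of scope; the backing
    # shared-memory segment does not seem to be reclaimed, and after
    # enough repetitions I get a bus error
    tmp = SharedArray(Float64, (m, n, T), init=false)
    tmp[1, 1, 1] = 1.0
end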
Thanks in advance, Johannes