Discussion:
efficiency of sparse array creation
Ryan Gardner
2014-04-29 23:46:22 UTC
Creating sparse arrays seems exceptionally slow.

I can set up the non-zero data of the array relatively quickly. For
example, the following code takes about 80 seconds on one machine.


vec_len = 700000

row_ind = Uint64[]
col_ind = Uint64[]
value = Float64[]

for j = 1:700000
    for k = 1:700
        ind = k*50
        push!(row_ind, ind)
        push!(col_ind, j)
        push!(value, 5.0)
    end
end


but then

a = sparse(row_ind, col_ind, value, 700000, 700000)


takes at least 30 minutes. (I never let it finish.)

It doesn't seem like the numbers I'm using should be that far off the
scale. Is there a more efficient way I should be doing what I'm doing? Am
I missing something and asking for something that really is impractical?

If not, I may be able to look into the sparse matrix code a little this
weekend.


I get the same never-finishing result if I try

sprand(700000, 700000, .001)

or if I try to set 700000*700 values in a sparse matrix of zeros directly.
Thanks.
Ivar Nesje
2014-04-30 07:06:17 UTC
Sorry for pointing out a probably obvious problem, but since others might try
to debug this issue on their laptops: how much memory do you have?
700000*700 floats plus indexes will require a minimum of 11 GB (if my math is
correct), and possibly more if the asymptotic storage requirement is more
than 2 Int64 + 1 Float64 per stored value.

Ivar
Viral Shah
2014-04-30 07:26:02 UTC
I believe the memory requirement should be 700000*700*16 bytes (64-bit
nonzeros and row indices) + 700001*8 bytes (64-bit column pointers) ≈ 7.8 GB.

This can be brought down a bit by using 32-bit index values and 64-bit
floats, but then you need 5.8 GB. Finally, if you use 32-bit index values
with 32-bit floats, you can come down to 4GB. The Julia sparse matrix
implementation is quite flexible and allows you to easily do such things.


julia> s = sparse(int32(1:10), int32(1:10), 1.0);

julia> typeof(s)
SparseMatrixCSC{Float64,Int32} (constructor with 1 method)

julia> s = sparse(int32(1:10), int32(1:10), float32(1.0));

julia> typeof(s)
SparseMatrixCSC{Float32,Int32} (constructor with 1 method)


-viral
Dominique Orban
2014-04-30 07:41:48 UTC
Downgrading the 700,000 to 70,000 for the sake of not waiting all night,
the original implementation takes about 4.3 seconds on my laptop.
Preallocating the arrays and using @inbounds brings it down to about 0.6
seconds; @simd doesn't seem to provide any further speedup. Building the
sparse matrix then takes about 3.8 seconds. Could this be due to the
conversion from triplet to CSC format?

ps: using the original size of 700,000, Julia reports a memory usage of
11.8GB.
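For reference, a minimal sketch of what the preallocated variant might look like (this is my illustration, not the code from the gist; names like nnz_per_col are mine, and the sizes are scaled down so it runs quickly):

```julia
n = 7000              # scaled down from 700000 for illustration
nnz_per_col = 70      # nonzeros per column
nnz_total = n * nnz_per_col

# Preallocate instead of growing with push! — avoids repeated reallocation.
row_ind = zeros(Int, nnz_total)
col_ind = zeros(Int, nnz_total)
value   = zeros(Float64, nnz_total)

p = 1
@inbounds for j = 1:n
    for k = 1:nnz_per_col
        row_ind[p] = k * 50   # same row pattern as the original example
        col_ind[p] = j
        value[p]   = 5.0
        p += 1
    end
end

a = sparse(row_ind, col_ind, value, n, n)
```

The entries are already generated in column-major, row-sorted order here, which is the friendliest case for the triplet-to-CSC conversion inside sparse.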
Viral Shah
2014-04-30 08:10:31 UTC
Could you post your code? That will save me from writing the same thing. :-)

Was building the vectors taking all the time, or was it in building the sparse matrix from the triplets? Triplet-to-CSC conversion is an expensive operation, and we have spent a fair amount of time making it fast. Of course, there could be more opportunities to speed up the code.

Where did you use @inbounds and @simd?

-viral
Tony Kelman
2014-04-30 12:59:27 UTC
If you're assembling the matrix in row-sorted column-major order and
there's no duplication, then you can also skip the conversion work by using
the SparseMatrixCSC constructor directly.
Ryan Gardner
2014-04-30 15:18:08 UTC
I've got 16GB of RAM on this machine. Largely, my question, with
admittedly little knowledge of the internal structure of the sparse arrays,
is why generating the actual SparseMatrixCSC is so much slower than
generating what is essentially another sparse matrix representation
consisting of the indices and values. (I realize that once we start
swapping, which will happen in my example, things slow down a ton, but even
the sprand I mention was slow.) Do you observe the same results? Is the
reason for the difference clear to someone else?

Thanks for all the comments. These are helpful. It had not crossed my
mind that I could control the data type of the indices.

Using the SparseMatrixCSC constructor directly would probably be very
helpful. Do you learn about that constructor from looking at source code
or do you see it somewhere else?
Viral Shah
2014-04-30 15:38:45 UTC
You can call SparseMatrixCSC directly, but then you have to do all the arrangement and sorting yourself. Depending on your application and how the nonzeros are generated, this may or may not help.

I will investigate this further. I now have all the information I need.

Thanks,

-viral
Viral Shah
2014-04-30 15:40:20 UTC
Octave 3.6 just gave up:

octave:1> tic; sprand(700000, 700000, .001); toc;
error: memory exhausted or requested size too large for range of Octave's index type -- trying to return to prompt


-viral
Dominique Orban
2014-04-30 16:31:45 UTC
Sorry, here's my code: https://gist.github.com/11431891

I don't see how to use SparseMatrixCSC directly. Doesn't it require that
the arrays already represent the CSC structure?
Tony Kelman
2014-04-30 16:44:45 UTC
The SparseMatrixCSC constructor currently has the signature
SparseMatrixCSC(m, n, colptr, rowval, nzval)
It looks as though this isn't formally documented, but it's a pretty clear
implementation if you understand the basics of the CSC format (and remember
that all the indexing is 1-based if you've worked with sparse matrices a
lot in C or Python).

In this particular case, you are constructing things in the proper order so
you can easily assemble colptr and don't need to rearrange rowval or nzval
at all.

Note that there was a recently opened pull
request https://github.com/JuliaLang/julia/pull/6696 that could change some
of this if it gets merged, so maybe don't get too attached to any use of
this constructor.
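A minimal sketch of what calling the constructor directly could look like for the regular pattern in this thread (my illustration, with sizes scaled down; it assumes nonzeros come in column-major, row-sorted order with no duplicates, which is exactly the case above):

```julia
n = 1000              # scaled down from 700000 for illustration
nnz_per_col = 7       # rows 50, 100, ..., 350 in every column
nnz_total = n * nnz_per_col

colptr = zeros(Int, n + 1)     # colptr[j] = index of column j's first nonzero
rowval = zeros(Int, nnz_total) # row index of each stored entry
nzval  = zeros(Float64, nnz_total)

# Every column holds the same number of entries, so colptr is uniform.
for j = 1:n+1
    colptr[j] = (j - 1) * nnz_per_col + 1
end

p = 1
for j = 1:n
    for k = 1:nnz_per_col
        rowval[p] = k * 50     # already sorted within each column
        nzval[p]  = 5.0
        p += 1
    end
end

# Wraps the arrays as-is — skips the triplet-to-CSC conversion entirely.
a = SparseMatrixCSC(n, n, colptr, rowval, nzval)
```

The equivalent sparse(rowval, colval, nzval, n, n) call would produce the same matrix but pay for the sort-and-compress step; note the constructor does no validation, so malformed colptr/rowval arrays will silently produce a broken matrix.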
Post by Dominique Orban
Sorry, here's my code: https://gist.github.com/11431891
I don't see how to use SparseMatrixCSC directly. Doesn't it require that
the arrays already represent the CSC structure?
Viral Shah
2014-04-30 18:36:14 UTC
Permalink
I ran the sprand example, and it took 290 seconds on a machine with enough RAM. Given that it is creating a matrix with half a billion nonzeros, this doesn’t sound too bad.

-viral
Dominique Orban
2014-04-30 19:21:26 UTC
Permalink
Of course in this case, it's easy to build the CSC arrays directly instead
of the (row, col, val) triples. I updated my gist. The construction of the
sparse matrix using a direct call to SparseMatrixCSC now takes 2.6e-6
seconds! This is still with vec_len=70,000. Here are the timings:

elapsed time: 4.478248647 seconds (6306447880 bytes allocated) # using push!
elapsed time: 0.682337676 seconds (1176000328 bytes allocated) # arrays and @inbounds
elapsed time: 0.713211454 seconds (1176000328 bytes allocated) # arrays, @inbounds and @simd
elapsed time: 4.46453756 seconds (1570811024 bytes allocated) # build sparse()
elapsed time: 0.504590721 seconds (784560344 bytes allocated) # build CSC arrays with @inbounds and @simd
elapsed time: 2.668e-6 seconds (96 bytes allocated) # build SparseMatrixCSC directly
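The push!-versus-arrays gap in these timings is the cost of growing the vectors incrementally; when the nonzero count is known up front, the triplet arrays can be sized once and filled in place. A minimal sketch in current Julia syntax, scaled down from the thread's dimensions (`build_triplets` is an illustrative name, not a function from the gist):

```julia
using SparseArrays

# Fill preallocated triplet arrays in one pass instead of push!-ing:
# the nonzero count (n columns * nnz_per_col rows each) is known in advance.
function build_triplets(n::Int, nnz_per_col::Int)
    total   = n * nnz_per_col
    row_ind = Vector{Int}(undef, total)
    col_ind = Vector{Int}(undef, total)
    value   = Vector{Float64}(undef, total)
    p = 1
    @inbounds for j = 1:n, k = 1:nnz_per_col
        row_ind[p] = k * 50
        col_ind[p] = j
        value[p]   = 5.0
        p += 1
    end
    return row_ind, col_ind, value
end

r, c, v = build_triplets(7_000, 70)
A = sparse(r, c, v, 7_000, 7_000)   # triplet-to-CSC conversion still runs here
</imports>
```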
Ryan Gardner
2014-04-30 19:29:42 UTC
Permalink
Hmmm. That is much better than I was getting. Thanks Viral.

Was it much faster for you to create the column-index, row-index, and value
arrays? I would still expect them to be roughly on par in terms of speed.
Viral Shah
2014-05-01 08:52:42 UTC
Permalink
It could be because of memory usage. I have 1 TB of RAM on the machine I was using. If you were running into swap, it would certainly take much longer.

I will try the other version as soon as the machine is available for me to use (some admin issues), and also look into speeding things up if possible. If the generation of the data is in your control, you can just generate it pre-sorted or in CSC format. I just need to check if we can shortcut pre-sorted data and generate the sparse matrix quickly.
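A cheap one-pass test for the "pre-sorted" shortcut described above could look like the following — a hypothetical helper, not an existing Julia function: triplets can be turned into CSC arrays without any sorting when columns are nondecreasing and rows are strictly increasing within each column.

```julia
# Hypothetical helper: returns true when the (row_ind, col_ind) triplets
# are already in column-major order with strictly increasing rows inside
# each column, i.e. they map to CSC arrays in a single pass with no sort
# and no duplicate-merging.
function triplets_presorted(row_ind::Vector{Int}, col_ind::Vector{Int})
    for p = 2:length(col_ind)
        if col_ind[p] < col_ind[p-1] ||
           (col_ind[p] == col_ind[p-1] && row_ind[p] <= row_ind[p-1])
            return false
        end
    end
    return true
end
```

The check is O(nnz) with no allocation, so running it before a triplet-to-CSC conversion costs little even when it fails.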

-viral
Viral Shah
2014-05-01 09:29:34 UTC
Permalink
Let's continue this discussion at

https://github.com/JuliaLang/julia/issues/6708

-viral