[graph-tool] Building graph from adjacency matrix

Tiago de Paula Peixoto tiago at skewed.de
Thu Mar 20 12:52:24 CET 2014


On 03/20/2014 11:30 AM, Pietro Battiston wrote:
> Hello,
>
> I tried graph-tool after some experience with networkx, in search for
> better efficiency (attracted by [1]) ... but unfortunately, it seems
> like it all depends on how data is stored!

This is true. If your code essentially depends on python loops,
graph-tool will not provide any advantage.

> The (several) networks I play with are dense weighted networks of 1435
> nodes. They are defined in terms of their adjacency matrices, which are
> pandas dataframes stored in a HDF5 store.
>
> In order to get the relative network with networkx, I would do
> dg = nx.from_numpy_matrix(hstore['my_df'].as_matrix(),
>                           create_using=nx.DiGraph())
>
> which took around 18 seconds. In graph-tool, the best alternative I
> found, based on [2] and [3], is:
>
>
> # Since the bottleneck is given by the calls to g.add_edge(),
> # numba actually slows down things a bit.
> #from numba import autojit
> #
> #@autojit
> def graph_from_adj_matrix(m):
>     g = Graph()
>
>     n = m.shape[0]
>
>     edges = list(g.add_vertex(n=n))
>
>     c = Counter(every=10, until=n)
>     for i in xrange(n):
>         for j in xrange(n):
>             if m[i,j]:
>                 g.add_edge(edges[i], edges[j])
>         c.count()
>
>     return g
>
> g = graph_from_adj_matrix(hstore['my_df'].as_matrix())
>
>
> ... which takes around 200 seconds (and is still not even weighted!).
>
> Is there anything trivial (or simply new) I'm missing (apart from the
> possibility, mentioned in [3], of writing my code in C++)?

The ability of inserting many edges without calling add_edge() several
times is indeed lacking...

Because of this, I have now just added a Graph.add_edge_list() to the
git version, which takes a list of edges to be added, which can be a
numpy array. If you have a full adjacency matrix instead of an edge
list, you can do simply:

    g.add_edge_list(transpose(nonzero(a)))

This should be much faster than the Python loop above.

Best,
Tiago

-- 
Tiago de Paula Peixoto <tiago at skewed.de>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 901 bytes
Desc: OpenPGP digital signature
URL: <https://lists.skewed.de/pipermail/graph-tool/attachments/20140320/eb6e0c72/attachment.asc>


More information about the graph-tool mailing list