[graph-tool] Memory requirement estimate for a large graph

Tiago de Paula Peixoto tiago at skewed.de
Mon Jul 6 14:40:52 CEST 2020


Am 06.07.20 um 13:00 schrieb Carlo Nicolini:
> Dear Tiago,
> 
> How is it possible to get an estimate for the memory requirement of a
> graph in graph-tool?

Yes, it is, and I should put this in the documentation somewhere.

> I know that graph-tool is built upon C++ and Boost, and the adjacency
> list is stored via a hash-map.

Not quite, we use an adjacency list using std::vector<>.

> Apart from the cost of storing the values of vertices indices and edges
> indices as `unsigned long`, what is the memory overhead of the
> structures used in storing the graph?

We use a vector-based bidirectional adjacency list, so each edge appears
twice. Each edge is comprised of two size_t (uint64_t) values, for the
target/source and the edge index, so we need 32 bytes per edge.

For each vertex we need a std::vector<> which is 24 bytes and a uint64_t
to separate the in-/out-lists, so we also need 32 bytes per node.

Therefore we need in total:

    (N + E) * 32 bytes

> For example, for a network of 1M vertices and 100M links without
> attributes, how much real memory should I plan to use, excluding
> temporaries?

That would be:

   3232000000 bytes = 3.01 GB

In practice you will need a little more, since std::vector<> tends to
over-commit.

Best,
Tiago

-- 
Tiago de Paula Peixoto <tiago at skewed.de>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://lists.skewed.de/pipermail/graph-tool/attachments/20200706/98000713/attachment.asc>


More information about the graph-tool mailing list