[graph-tool] Shortest_distance complexity when used with max_dist

François francois.kawala at gmail.com
Mon Mar 9 11:28:09 UTC 2015


Typo + precision fix:

"These 34.9ms have to be compared to the 300ms (back of envelope calculus)
that takes the copy_property function in my 33M vertices graph."

2015-03-09 12:25 GMT+01:00 François Kawala <francois.kawala at gmail.com>:

> Thanks for these clarifications.
>
> I guess that most of the time is spent in the *copy_property* [link to git
> <https://git.skewed.de/count0/graph-tool/blob/master/src/graph/graph_properties_copy.cc#L33>].
> Indeed, when I provide a pre-initialized distance vector through the dist_map
> parameter, the execution of shortest_distance is only 10ms faster.
>
> If I understood well, the *copy_property *is used to build a vector
> initialized with one single value. If so, one would obtain the same result
> with python and numpy with
>
> x = np.empty(array_size, dtype=float)
> x.fill(value)
>
> I did try to time this "numpy way initialization" (altough I'm not sure it
> corresponds to *copy_property*)
>
> python -m timeit -s "import numpy as np" -n 100 "np.empty(33e6,
> dtype=int).fill(1)"
> 100 loops, best of 3: 34.9 msec per loop
>
> These 34.9ms have to be compared to the 300ms (bake of envelope calculus)
> that takes the *copy_property *function.
>
> Am I right about the way that the copy_property function works ? Could it
> be improved ?
>
> Best,
> F.
>
>
> 2015-03-08 20:11 GMT+01:00 Tiago Peixoto [via Main discussion list for the
> graph-tool project] <ml-node+s982480n4026027h94 at n3.nabble.com>:
>
>> On 07.03.2015 18:20, François wrote:
>>
>> > Consider two graphs: G=(V,E) and H=(V',E') such that:
>> >
>> >  1. V' \subset V and E' \subset E
>> >  2. |E| = 66M
>> >  3. |E'|=2M.
>> >
>> > Run:
>> >
>> >  1. X = shortest_distance(G, G.vertex(0), max_dist=x,
>> weights=G.ep['weights'])
>> >  2. Y = shortest_distance(H, H.vertex(0), max_dist=x,
>> weights=H.ep['weights'])
>> >
>> > I verify that : X = Y
>> >
>> > The execution took 320ms for (1) and 16ms for (2). However the number
>> > of visited vertices should be identical ? I expected to obtain
>> > comparable execution times.
>> This is because of the initialization of the algorithm. The algorithm
>> needs to create and initialize at least two vectors of size |V| to store
>> the distances and book-keeping, regardless of how many vertices are in
>> fact visited. Therefore the algorithm will never be faster than O(|V|),
>> independently of "max_dist".
>>
>> Here is more or less what you are describing:
>>
>> In [1]: g = lattice([10, 10])
>>
>> In [2]: %time d = shortest_distance(g, g.vertex(0), max_dist=2)
>> CPU times: user 3.33 ms, sys: 0 ns, total: 3.33 ms
>> Wall time: 2.43 ms
>>
>> In [3]: g = lattice([1000, 1000])
>>
>> In [4]: %time d = shortest_distance(g, g.vertex(0), max_dist=2)
>> CPU times: user 50 ms, sys: 0 ns, total: 50 ms
>> Wall time: 28.6 ms
>>
>> Note that the second graph is ten thousand times larger than the first
>> one, but the algorithm was only a little more than ten times slower.
>>
>> Now if we do it for the larger graph without max dist, we have:
>>
>> In [5]: %time d = shortest_distance(g, g.vertex(0))
>> CPU times: user 270 ms, sys: 0 ns, total: 270 ms
>> Wall time: 260 ms
>>
>> Which is around 10 times slower... So thing are indeed faster if
>> max_dist is specified, but the algorithm will still depend on the total
>> size of the graph.
>>
>> Best,
>> Tiago
>>
>> --
>> Tiago de Paula Peixoto <[hidden email]
>> <http:///user/SendEmail.jtp?type=node&node=4026027&i=0>>
>>
>>
>> _______________________________________________
>> graph-tool mailing list
>> [hidden email] <http:///user/SendEmail.jtp?type=node&node=4026027&i=1>
>> http://lists.skewed.de/mailman/listinfo/graph-tool
>>
>> *signature.asc* (836 bytes) Download Attachment
>> <http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/attachment/4026027/0/signature.asc>
>>  --
>> Tiago de Paula Peixoto <tiago at skewed.de>
>>
>>
>> ------------------------------
>>  If you reply to this email, your message will be added to the
>> discussion below:
>>
>> http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/Shortest-distance-complexity-when-used-with-max-dist-tp4026018p4026027.html
>>  To start a new topic under Main discussion list for the graph-tool
>> project, email ml-node+s982480n2141189h16 at n3.nabble.com
>> To unsubscribe from Shortest_distance complexity when used with max_dist, click
>> here
>> <http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code&node=4026018&code=ZnJhbmNvaXMua2F3YWxhQGdtYWlsLmNvbXw0MDI2MDE4fDIxMTQ0MDk4Nzk=>
>> .
>> NAML
>> <http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>
>>
>
>
>
> --
> François Kawala
>



-- 
François Kawala




--
View this message in context: http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/Shortest-distance-complexity-when-used-with-max-dist-tp4026018p4026029.html
Sent from the Main discussion list for the graph-tool project mailing list archive at Nabble.com.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.skewed.de/pipermail/graph-tool/attachments/20150309/da1b7294/attachment.html>


More information about the graph-tool mailing list