[graph-tool] Questions about output of "history" of graph_tool.inference.mcmc_equilibrate
P-M
pmj27 at cam.ac.uk
Tue Feb 21 12:07:33 CET 2017
Hi Tiago,
I have not yet reproduced the original problem, but I have found a different
problem with the history on a smaller graph, which I can upload. I ran the
following code (the network is deliberately small, so please ignore the
actual results):
import graph_tool.all as gt
import timeit
import random
import cPickle as pickle

def collect_edge_probs(s):
    for i in range(len(missing_edges)):
        p = s.get_edges_prob([missing_edges[i]],
                             entropy_args=dict(partition_dl=False))
        probs[i].append(p)

g = gt.load_graph('graph_no_multi_clean.gt')
pub_years = [1800]
vertex_filter = g.new_vertex_property("bool")
edge_filter = g.new_edge_property("bool")

for pub_year in pub_years:
    #Initialise parallel edges filter
    parallel_edges_filter = g.new_edge_property("int", val=0)
    #filter vertices by date
    for v in g.vertices():
        if g.vp.v_pub_year[v] <= pub_year:
            vertex_filter[v] = True
        else:
            vertex_filter[v] = False
    g.set_vertex_filter(vertex_filter)
    #now filter edges by date
    for e in g.edges():
        if g.ep.pub_year[e] <= pub_year:
            edge_filter[e] = True
        else:
            edge_filter[e] = False
    g.set_edge_filter(edge_filter)
    #cannot simply delete all parallel edges as that might prevent accurate
    #filtering of edges by date in the next step
    gt.label_parallel_edges(g, eprop=parallel_edges_filter)
    for e in g.edges():
        if parallel_edges_filter[e] != 0:
            edge_filter[e] = False
    g.set_edge_filter(edge_filter)

    remaining_v_indices = []
    for v in g.vertices():
        remaining_v_indices.append(int(g.vertex_index[v]))
    num_vertices = g.num_vertices()
    random_origins = random.sample(remaining_v_indices,
                                   int(0.01*num_vertices))
    random_targets = random.sample(remaining_v_indices,
                                   int(0.01*num_vertices))
    missing_edges = []
    for v1 in random_origins:
        for v2 in random_targets:
            if v1 == v2:
                continue
            elif g.edge(v1, v2) == None:
                missing_edges.append((v1, v2))

    state = gt.minimize_nested_blockmodel_dl(g, deg_corr=True)
    state = state.copy(sampling=True)
    probs = [[] for _ in range(len(missing_edges))]
    mcmc_args = dict(niter=10)
    # Now we collect the probabilities for exactly 10,000 sweeps
    history = gt.mcmc_equilibrate(state, force_niter=1000,
                                  mcmc_args=mcmc_args,
                                  callback=collect_edge_probs, history=True)
    name = 'history' + str(g.num_vertices()) + '.pkl'
    with open(name, 'wb') as missing_edges_pkl:
        pickle.dump(history, missing_edges_pkl, -1)
    #undo filtering
    g.set_edge_filter(None)
    g.set_vertex_filter(None)
Looking at `history`, I find that every entry is [7842.8484318875344, a],
where `a` is some single-digit integer. Given that the expected format is
[iteration, entropy], I can't quite make sense of it: the first entry is
always the same, and a decimal number isn't what I expected for an iteration
counter. The last number doesn't work as an iteration counter either, as it
doesn't increment in any straightforward way. Do you know what is going
wrong here? Could this be related to the issue I observed previously? I have
attached the
history output here ( history1023.pkl
<http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/file/n4027051/history1023.pkl>
) and the graph as a zipped file here as it was too large otherwise (
graph_no_multi_clean.zip
<http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/file/n4027051/graph_no_multi_clean.zip>
).
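To make the observation above easier to check, here is a minimal sketch that summarizes each column of the history and reports whether it ever increases strictly, as an iteration counter would. It assumes `history` is a list of equal-length records, as loaded from the pickle; `summarize_history` is a hypothetical helper and not part of graph-tool, and the integers in the example records are invented for illustration.

```python
def summarize_history(history):
    """Summarize each column of a list of equal-length records."""
    columns = list(zip(*history))  # transpose records into columns
    summary = []
    for values in columns:
        summary.append({
            "min": min(values),
            "max": max(values),
            "n_distinct": len(set(values)),
            # True only if every value is strictly larger than the one
            # before it, which is what an iteration counter would do.
            "strictly_increasing": all(a < b
                                       for a, b in zip(values, values[1:])),
        })
    return summary


# Illustrative records shaped like the output described above: a constant
# float followed by a small integer. The integers are made up.
example = [[7842.8484318875344, 3], [7842.8484318875344, 0],
           [7842.8484318875344, 5], [7842.8484318875344, 2]]

for i, col in enumerate(summarize_history(example)):
    print(i, col)
```

On records like these, neither column is strictly increasing and the first has a single distinct value, which matches what I see in the attached file.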