[graph-tool] Questions about inference algorithms
Tiago de Paula Peixoto
tiago at skewed.de
Wed Jun 8 10:29:16 CEST 2016
On 08.06.2016 10:14, Andrea Briega wrote:
> Thank you very much, your answers have been really helpful. I am now
> on the last step, model selection, and I would like to be sure that
> I’m doing it right. I compute the posterior odds ratio to compare two
> partitions as e^-(dl1 - dl2), with dl1 and dl2 being the higher and
> lower description lengths, respectively. I obtained the description
> length using ‘state.entropy()’ for nested models and
> ‘state.entropy(dl=True)’ for non-nested ones.
This is correct. Note that in current versions of graph-tool you can
also just call state.entropy() for non-nested models, since dl=True is
the default.
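The comparison above can be sketched as follows. This is a minimal, self-contained illustration of the posterior odds ratio e^-(dl1 - dl2); the description-length values are made up for the example, and in practice dl1 and dl2 would come from state.entropy() on two fitted models:

```python
import math

# Hypothetical description lengths (in nats) from two fitted models;
# dl1 is the larger (worse) and dl2 the smaller (better) value.
dl1 = 11235.0
dl2 = 11210.0

# Posterior odds ratio Lambda = exp(-(dl1 - dl2)). Values far below
# ~0.01 indicate decisive evidence for the model with smaller dl.
odds = math.exp(-(dl1 - dl2))
```

Here the difference of 25 nats already gives odds of about e^-25, i.e. decisive support for the second model.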
> I have doubts about this because small differences in description
> length cause values much lower than 0.01, so in most cases the
> evidence supporting one of the models is decisive. I only get values
> higher than 0.01 if the difference in description length is lower than
> 5 units. With my data (24,000 nodes and 5,000,000 edges) I always
> obtain decisive support, either when I compare different models or
> when I compare different runs of the same model. I wonder if this is
This is indeed expected if you have lots of data (i.e. large
networks). With sufficient data, the evidence for the better model
should always become decisive, as long as the models being compared are
indeed distinguishable. 5 million edges is quite a lot, and indeed I
would expect the posterior odds to be quite small in this situation.
You just have to make sure that you have found the best fit (i.e. the
smallest description length) for each model you are comparing, by
running the algorithm as many times as you can and keeping the best
result.
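The repeated-runs procedure can be sketched as below. The function run_inference is a hypothetical stand-in for one stochastic minimization run (e.g. calling gt.minimize_blockmodel_dl(g) and taking state.entropy() in graph-tool); here it just simulates noisy description lengths so the sketch is self-contained:

```python
import random

# Hypothetical stand-in for one run of the inference algorithm: each
# run returns a description length with some run-to-run variability.
def run_inference(seed):
    rng = random.Random(seed)
    return 11200.0 + rng.uniform(0.0, 50.0)

# Repeat the stochastic minimization and keep the smallest description
# length found; that is the value to use in the model comparison.
best_dl = min(run_inference(seed) for seed in range(20))
```

With graph-tool itself, the same pattern applies: run the minimization several times (varying the random seed) and compare only the best fit of each model.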
Tiago de Paula Peixoto <tiago at skewed.de>