[graph-tool] Questions about inference algorithms

Andrea Briega annbrial at gmail.com
Wed Jun 8 17:02:01 CEST 2016


Ok, thanks! I'll run the algorithm more times then to try to find the best
fit.
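
A minimal sketch of that repeat-and-keep-the-best loop, assuming each run returns a state object whose `entropy()` method gives the description length (as described further down this thread for nested models). The helper `best_of_runs`, the `fit` callable and the `FakeState` stand-in are hypothetical names used only for illustration; they are not part of graph-tool. With graph-tool installed, `fit` could be something like `lambda: gt.minimize_nested_blockmodel_dl(g, state_args=dict(pclabel=vprop_double))`.

```python
import random


def best_of_runs(fit, n_runs=10):
    """Call fit() n_runs times and keep the state with the lowest description length.

    fit is any zero-argument callable returning an object with an
    entropy() method (hypothetical interface, assumed to match the
    state objects discussed in this thread).
    """
    if n_runs < 1:
        raise ValueError("n_runs must be at least 1")
    best_state, best_dl = None, float("inf")
    for _ in range(n_runs):
        state = fit()
        dl = state.entropy()
        if dl < best_dl:
            best_state, best_dl = state, dl
    return best_state, best_dl


class FakeState:
    """Stand-in for a fitted state, so the sketch runs without graph-tool."""

    def __init__(self, dl):
        self._dl = dl

    def entropy(self):
        return self._dl


if __name__ == "__main__":
    rng = random.Random(0)
    state, dl = best_of_runs(lambda: FakeState(rng.uniform(1000.0, 1010.0)), n_runs=5)
    print(dl)
```

Since each run is stochastic, the state with the lowest description length found so far is only the best of the runs tried, not a guaranteed optimum.
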

Best regards,


Andrea


2016-06-08 10:14 GMT+02:00 Andrea Briega <annbrial at gmail.com>:

> Thank you very much, your answers have been really helpful.  I am now on
> the last step, model selection, and I would like to be sure that I’m doing
> it right. I compute the posterior odds ratio to compare two partitions as
> e^-(dl1-dl2), with dl1 and dl2 the higher and lower description
> lengths respectively. I obtained the description lengths using
> ‘state.entropy()’ for nested models and ‘state.entropy(dl=True)’ for
> non-nested ones.
> I have doubts about this because small differences in description length
> give values much lower than 0.01, so in most cases the evidence supporting
> one of the models is decisive. I only get values higher than 0.01 if the
> difference in description length is lower than 5 units. With my data
> (24,000 nodes and 5,000,000 edges) I always obtain decisive support,
> either when I compare different models or when I compare different runs of
> the same model. I wonder if this is right.
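
The calculation described in the quoted message can be sketched as below. It assumes both description lengths are in nats and were computed for the same data; the function name `posterior_odds` is hypothetical and not a graph-tool call.

```python
import math


def posterior_odds(dl_a, dl_b):
    """Odds of the higher-description-length fit relative to the lower one.

    Implements exp(-(dl_high - dl_low)) as written in the message above;
    both values are assumed to be in nats and to refer to the same data.
    """
    dl_high, dl_low = max(dl_a, dl_b), min(dl_a, dl_b)
    return math.exp(-(dl_high - dl_low))


# Difference in description length at which the odds equal 0.01.
threshold = math.log(100)

if __name__ == "__main__":
    for delta in (1.0, threshold, 5.0, 50.0):
        print(delta, posterior_odds(delta, 0.0))
```

The ratio depends only on the absolute difference, and it falls below 0.01 once that difference exceeds ln(100), about 4.6, which matches the "lower than 5 units" observation in the message.
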
>
> Thanks again,
>
>
> Andrea
>
> 2016-05-19 9:55 GMT+02:00 Andrea Briega <annbrial at gmail.com>:
>
>> Dear Mr Peixoto,
>>
>>
>> I have just run a few analyses with the new version of your package and my
>> results change completely between v2.13 and v2.16.
>>
>> minimize_nested_blockmodel_dl is the function whose results change the
>> most, and it is very difficult to find a biological explanation for the new
>> results, mainly at the upper hierarchical levels.
>> I would like to know the specific changes between these two versions to
>> better understand my results. Is it possible to change any parameter
>> to run this function in a way similar to v2.13? I used to run this
>> function this way:
>>
>> state = minimize_nested_blockmodel_dl(g, pclabel=vprop_double,
>> overlap=False, nonoverlap_init=False, deg_corr=True, layers=False)
>>
>> And I have run the new version of the function this way:
>>
>> state = minimize_nested_blockmodel_dl(g,
>> state_args=dict(pclabel=vprop_double), overlap=False,
>> nonoverlap_init=False,  deg_corr=True,  layers=False)
>>
>>
>> Thank you very much,
>>
>> Andrea
>>
>> 2016-05-12 18:43 GMT+02:00 Andrea Briega <annbrial at gmail.com>:
>>
>>> Thank you very much! I was wrong, I meant "state =
>>> gt.minimize_blockmodel_dl(g, pclabel=vprop_double)"; it was a mistake
>>> I made while writing the mail. So the key was the use of "state_args". I
>>> tried it but with a different notation and obviously it didn't work. Now I
>>> can go on!
>>>
>>> Thanks again,
>>>
>>>
>>> Andrea
>>>
>>>
>>> 2016-05-12 10:56 GMT+02:00 Andrea Briega <annbrial at gmail.com>:
>>>
>>>> Hi again,
>>>>
>>>> I have recently noticed the update of graph-tool and now I am a
>>>> little bit confused about some changes. Sorry, I know my questions are very
>>>> basic. I am not familiar with this language and I have some difficulties
>>>> getting results.
>>>>
>>>> I am running inference algorithms to get the best model using different
>>>> options of model selection. I want to set pclabel in the inference
>>>> algorithms because I know a priori my network is bipartite, and next I want
>>>> to get the description length. Before the update I did it this way:
>>>>
>>>> vprop_double = g.new_vertex_property("int") # g is my network
>>>> for i in range(0, 11772):
>>>>     vprop_double[g.vertex(i)] = 1
>>>> for i in range(11773, 214221):
>>>>     vprop_double[g.vertex(i)] = 2
>>>>
>>>> state = gt.minimize_blockmodel_dl(g, pclabel=True)
>>>>
>>>> state.entropy(dl=True) # I am not sure this is the right way to get the
>>>> description length.
>>>>
>>>> But now I have some problems. First of all, minimize_blockmodel_dl
>>>> doesn't have a pclabel argument so I don't know how to specify it in the
>>>> inference algorithm. I have tried this:
>>>>
>>>> state.pclabel = vprop_double
>>>>
>>>> But I get the same result when I do "state.entropy(dl=True)" as before.
>>>> Also, I get the same result doing "state.entropy(dl=True)" or
>>>> "state.entropy()", and I don't understand why either.
>>>>
>>>> And finally, in NestedBlockState objects I don't know how to get the
>>>> description length because entropy doesn't have a "dl" argument. In these
>>>> objects, are entropy and dl the same?
>>>>
>>>> In conclusion, I don't know how to set pclabel and to get the
>>>> description length in hierarchical models, and I am not sure if I am
>>>> getting it correctly in non-hierarchical ones.
>>>>
>>>> Sorry again for my basic questions but I can't go on because of these
>>>> problems.
>>>>
>>>> Thank you very much!
>>>>
>>>> Best regards,
>>>>
>>>>
>>>>
>>>>
>>>> Andrea
>>>>
>>>>
>>>>
>>>>
>>>> 2016-05-10 11:41 GMT+02:00 Andrea Briega <annbrial at gmail.com>:
>>>>
>>>>> Thank you very much! Your answer has been really helpful; now I
>>>>> understand this much better. I'll think about the options you suggested.
>>>>>
>>>>> Thanks again,
>>>>>
>>>>>
>>>>> Andrea
>>>>>
>>>>> 2016-05-09 16:33 GMT+02:00 Andrea Briega <annbrial at gmail.com>:
>>>>>
>>>>>> Dear Dr Peixoto,
>>>>>>
>>>>>>
>>>>>> I would like to ask some questions I have about inference
>>>>>> algorithms for the identification of large-scale network structure via the
>>>>>> statistical inference of generative models.
>>>>>>
>>>>>> The minimize_blockmodel algorithm takes an hour to finish on my
>>>>>> network with 21000 nodes (like the hierarchical version), and it takes
>>>>>> two and a half days with overlap. However, I have started a hierarchical
>>>>>> analysis with overlap, and it has now been running for 14 days. So my
>>>>>> first question is: is this time normal, or might there be a problem? Do
>>>>>> you know how long it usually takes?
>>>>>>
>>>>>> Secondly, I have repeated some of these analyses with exactly the same
>>>>>> options but I get different solutions (similar but different), so I wonder
>>>>>> if the algorithm is heuristic (I thought it was exact).
>>>>>>
>>>>>> My last question regards bipartite analysis. I have two
>>>>>> types of nodes in my network and I wonder if there is any analytical
>>>>>> difference between running these algorithms with the bipartite option
>>>>>> (clabel=True, and different labels in each group of nodes) or without it,
>>>>>> because it seems that the program “knows” my network is bipartite in any
>>>>>> case. If there are differences between bipartite and “unipartite” analysis
>>>>>> (clabel=False), is it possible to compare description lengths between them
>>>>>> for model selection?
>>>>>>
>>>>>> Thank you very much for your help!
>>>>>>
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>>
>>>>>>
>>>>>> Andrea
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>