Detecting Communities with Louvain Method and VOS Clustering
Detecting communities (Pajek and PajekXXL)
Louvain community detection algorithm is available in Pajek and PajekXXL 3.02 or later.
From version 3.04 on, the implementation offers the resolution parameter. In this way users have control
over the size and number of communities found (resolution 1 means standard Louvain method,
higher resolutions produce larger number of clusters, lower resolutions produce lower number of clusters).
In this version the standard Louvain algorithm was replaced by Multi-Level Coarsening + Multi-Level Refinement algorithm.
From version 3.05 on, number of restarts parameter is included. That enables running
the optimization several times and selecting the best partition over all runs.
From version 3.05 on, another community detection algorithm (VOS Clustering) is available.
The usage is very similar to usage of Louvain method, therefore we will explain usage
only for Louvain method.
In Louvain method modularity is optimized while in VOS Clustering VOS quality.
Comparison of results obtained by both methods can be found
here.
Both algorithms are very fast and can be applied to huge sparse networks containing 100s of millions of vertices.
Values of lines (if any) are also taken into account in both algoritms.
Two algorithms are available (for details see: Multilevel local search algorithms for modularity clustering):
- Multi-Level Coarsening + Single Refinement - performs only refinement of the partition obtained in the last level (the coarsest partition).
- Multi-Level Coarsening + Multi-Level Refinement - iterativelly performs coarsening and refinement phase for each obtained level.
Sequence of steps in Pajek
- Download sample network file (25069 vertices, 62608 edges) and load it to Pajek/PajekXXL.
- Start community search: Network/Create Partition/Communities/Louvain Method
Usually several levels are needed.
Pajek returns the best partition according to all levels.
Number of clusters (NC) in levels decreases (smaller clusters are fused to larger ones in later levels).
On the other hand modularity (Q)
(or VOS Quality)
of the partition (which is reported together with the number of clusters) increases.
- Try the algorithm with different values of resolution parameter
(resolution 1 means standard Louvain method, higher resolutions produce larger number of clusters, lower resolutions produce lower number of clusters).
To find as good (and as many) solutions as possible in algorithm vertices are taken into account randomly.
Because of that the algorithm usually returns different results in each execution.
Therefore it is recommended to run the algorithm with several restarts which selects the best partition from all restarts.
- Recommendation: Compare the partitions obtained in two runs with the same resolution parameter (using Partitions/Info/Cramer's V, Rajski, Adjusted Rand Index).
If the correlation of the two partitions is small, the number of communities is probably not the right one,
therefore we suggest to try the algorithm with another (larger or smaller) value of resolution parameter.
In our case we get the following results for values of resolution parameter 1.00, 0.50, and 40.00 respectively:
Resolution: 1.00. Modularity: 0.935506. Number of Communities: 166.
Resolution: 0.50. Modularity: 0.938871. Number of Communities: 105.
Resolution: 40.00. Modularity: 0.852442. Number of Communities: 500.
The correlation among partitions obtained with the same value of resolution parameter is the highest for
resolution=40.00 (Cramer's V=0.998) therefore we will use this communities as the right ones
(although the modularity is the smallest for this value of resolution parameter).
Important: Modularity can be used only for comparisons of partitions obtained with the same value of resolution parameter.
- We can adjust Maximum Number of Iterations in each Restart, Maximum Number of Levels in each Iteration allowed
and Maximum Number of Repetitions in each Level allowed.
The default values (20, 20 and 50 respectivelly) works fine for most of the networks.
Note that the first level takes most of the time, later levels are done very quickly,
especially if number of clusters identified in the first level is already low according to number of vertices
(algorithm is execuded on shrunken networks in later levels).
- We can use Operations/Network+Partition/Info to compute
network modularity according to partition
or VOS Quality of the partition.
It can be used on any partition (not only on partitions obtained by Louvain method or VOS Clustering).
- In case of a signed network (at least one line value is negative) a special version of Louvain algorithm is called
(maximizing sum of positive and minimizing sum of negative lines inside communities).
On the other hand in VOS Clustring all line values are considered as positive (absolute line value are taken into account).
Visualizing communities
Comparing Louvain method and VOS Clustering
Back to Pajek and Pajek-XXL Main page.
|