Variation of Information
Variation of information (also known as shared information distance) measures the distance between two clusterings. It is derived from mutual information but, unlike mutual information, it is a true metric: it is symmetric and satisfies the triangle inequality. See
Meila, Marina (2003). Comparing Clusterings by the Variation of Information. Learning Theory and Kernel Machines: 173–187.
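For orientation, the standard definition can be written out as below. The symbols are introduced here for illustration and do not appear on this page: C and C' are the two clusterings, H is the entropy of a clustering's cluster-size distribution, and I is the mutual information between the two.

```latex
\mathrm{VI}(C, C') = H(C) + H(C') - 2\, I(C, C') = H(C \mid C') + H(C' \mid C)
```

The value is zero exactly when the two clusterings agree up to a relabeling of clusters.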
Clustering.varinfo - Function
varinfo(k1::Int, a1::AbstractVector{Int}, k2::Int, a2::AbstractVector{Int})
varinfo(R::ClusteringResult, k0::Int, a0::AbstractVector{Int})
varinfo(R1::ClusteringResult, R2::ClusteringResult)
Compute the variation of information between the two clusterings.
Each clustering is provided either as an instance of a ClusteringResult subtype or as a pair of arguments:
- a number of clusters (k1, k2, k0)
- a vector of point to cluster assignments (a1, a2, a0).
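To make the (number of clusters, assignment vector) argument convention concrete, here is a minimal from-scratch sketch of the quantity being computed. The function name vi_sketch is hypothetical and is not part of Clustering.jl. It assumes the natural logarithm and assignments in the range 1:k, and it is not the library's implementation.

```julia
# Hypothetical helper, not part of Clustering.jl: variation of information
# computed directly from two assignment vectors (natural logarithm assumed).
function vi_sketch(k1::Int, a1::AbstractVector{Int}, k2::Int, a2::AbstractVector{Int})
    length(a1) == length(a2) ||
        throw(DimensionMismatch("assignment vectors differ in length"))
    n = length(a1)

    # Contingency table: counts[i, j] is the number of points placed in
    # cluster i by the first clustering and cluster j by the second.
    counts = zeros(Int, k1, k2)
    for (i, j) in zip(a1, a2)
        counts[i, j] += 1
    end
    rows = sum(counts, dims=2)   # cluster sizes of the first clustering
    cols = sum(counts, dims=1)   # cluster sizes of the second clustering

    # VI = H(C | C') + H(C' | C), accumulated over the non-empty cells.
    vi = 0.0
    for j in 1:k2, i in 1:k1
        c = counts[i, j]
        c == 0 && continue
        vi -= (c / n) * (log(c / rows[i]) + log(c / cols[j]))
    end
    return vi
end

a1 = [1, 1, 2, 2]
a2 = [2, 2, 1, 1]   # same partition as a1 with the labels swapped
a3 = [1, 2, 1, 2]   # partition that cuts across a1

vi_same  = vi_sketch(2, a1, 2, a2)
vi_cross = vi_sketch(2, a1, 2, a3)
```

Going by the signatures listed above, the corresponding library call would be Clustering.varinfo(2, a1, 2, a2). Whether it matches this sketch digit for digit depends on the logarithm base the library uses, which this page does not state.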