Variation of Information
Variation of information (also known as shared information distance) is a measure of the distance between two clusterings. It is derived from the mutual information, but unlike mutual information it is a true metric, i.e. it is symmetric and satisfies the triangle inequality. See
Meila, Marina (2003). Comparing Clusterings by the Variation of Information. Learning Theory and Kernel Machines: 173–187.
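For orientation, the usual information-theoretic form of this distance can be written as below. The symbols are notation chosen here rather than taken from this page: C and C' denote the two clusterings, H the entropy of a clustering's assignment distribution, and I the mutual information between the two.

```math
VI(C, C') = H(C) + H(C') - 2\, I(C, C') = H(C \mid C') + H(C' \mid C)
```

Two clusterings that differ only by a relabeling of clusters have VI equal to zero, since each then fully determines the other.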
Clustering.varinfo (Function)
varinfo(k1::Int, a1::AbstractVector{Int}, k2::Int, a2::AbstractVector{Int})
varinfo(R::ClusteringResult, k0::Int, a0::AbstractVector{Int})
varinfo(R1::ClusteringResult, R2::ClusteringResult)
Compute the variation of information between the two clusterings.
Each clustering is provided either as an instance of a ClusteringResult subtype or as a pair of arguments:
- a number of clusters (k1, k2, k0)
- a vector of point-to-cluster assignments (a1, a2, a0).
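To make the "number of clusters plus assignment vector" form concrete, here is a minimal sketch that computes the variation of information directly from the contingency table of two assignment vectors. The helper name `vi_sketch` is hypothetical and is not part of Clustering.jl. The sketch assumes natural logarithms and assignments numbered from 1 to the number of clusters.

```julia
# Hypothetical helper (not part of Clustering.jl): variation of information
# computed from the contingency table of two assignment vectors.
# Assumes a1 takes values in 1:k1 and a2 takes values in 1:k2.
function vi_sketch(k1::Int, a1::AbstractVector{Int}, k2::Int, a2::AbstractVector{Int})
    length(a1) == length(a2) ||
        throw(DimensionMismatch("assignment vectors differ in length"))
    n = length(a1)

    # counts[i, j] = number of points in cluster i of a1 and cluster j of a2
    counts = zeros(Int, k1, k2)
    for (i, j) in zip(a1, a2)
        counts[i, j] += 1
    end
    rows = vec(sum(counts, dims=1 + 1))  # cluster sizes in the first clustering
    cols = vec(sum(counts, dims=1))      # cluster sizes in the second clustering

    # VI = H(C | C') + H(C' | C), accumulated over non-empty cells
    vi = 0.0
    for i in 1:k1, j in 1:k2
        c = counts[i, j]
        c == 0 && continue
        vi -= (c / n) * (log(c / rows[i]) + log(c / cols[j]))
    end
    return vi
end

a1 = [1, 1, 2, 2]
a2 = [2, 2, 1, 1]   # same partition as a1, labels swapped
a3 = [1, 2, 1, 2]   # splits every cluster of a1 in half

vi_same = vi_sketch(2, a1, 2, a2)   # relabeled partitions: distance 0
vi_diff = vi_sketch(2, a1, 2, a3)   # independent partitions: 2 * log(2)
```

With Clustering.jl loaded, the same inputs would be passed following the signatures above, e.g. `Clustering.varinfo(2, a1, 2, a3)`. Whether the result agrees digit for digit with this sketch depends on the logarithm base the library uses, which this page does not state.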