dacapo.utils.voi

Functions

`voi`(reconstruction, groundtruth[, ...])	Return the conditional entropies of the variation of information metric. [1]
`split_vi`(x[, y, ignore_x, ignore_y])	Return the symmetric conditional entropies associated with the VI.
`vi_tables`(x[, y, ignore_x, ignore_y])	Return probability tables used for calculating VI.
`contingency_table`(seg, gt[, ignore_seg, ignore_gt, norm])	Return the contingency table for all regions in matched segmentations.
`divide_columns`(matrix, row[, in_place])	Divide each column of matrix by the corresponding element in row.
`divide_rows`(matrix, column[, in_place])	Divide each row of matrix by the corresponding element in column.
`xlogx`(x[, out, in_place])	Compute x * log_2(x).

Module Contents

dacapo.utils.voi.voi(reconstruction, groundtruth, ignore_reconstruction=[], ignore_groundtruth=[0])

Return the conditional entropies of the variation of information metric. [1]

Let X be a reconstruction, and Y a ground truth labelling. The variation of information between the two is the sum of two conditional entropies:

VI(X, Y) = H(X|Y) + H(Y|X).

The first term, H(X|Y), measures oversegmentation; the second, H(Y|X), measures undersegmentation. They are referred to as the variation of information split error and merge error, respectively.
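The decomposition above can be illustrated with a minimal NumPy sketch. It is not the implementation used by this module: the function name `vi_split_merge_sketch` is hypothetical, no labels are ignored, and entropies are computed in bits (base-2 logarithm, consistent with `xlogx`).

```python
import numpy as np


def vi_split_merge_sketch(x, y):
    """Return (H(X|Y), H(Y|X)) in bits for two label arrays of equal shape.

    Hypothetical helper for illustration only; it does not ignore any labels.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    if x.shape != y.shape:
        raise ValueError("x and y must have the same shape")

    # Map arbitrary labels to consecutive indices 0..n-1.
    _, xi = np.unique(x.ravel(), return_inverse=True)
    _, yi = np.unique(y.ravel(), return_inverse=True)

    # Joint distribution p(x, y): count co-occurring label pairs, then normalize.
    counts = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(counts, (xi, yi), 1)
    pxy = counts / counts.sum()

    def entropy(p):
        p = p[p > 0]  # treat 0 * log(0) as 0
        return float(-(p * np.log2(p)).sum())

    h_xy = entropy(pxy)             # joint entropy H(X, Y)
    h_x = entropy(pxy.sum(axis=1))  # marginal entropy H(X)
    h_y = entropy(pxy.sum(axis=0))  # marginal entropy H(Y)

    # Chain rule: H(X|Y) = H(X, Y) - H(Y) and H(Y|X) = H(X, Y) - H(X).
    return h_xy - h_y, h_xy - h_x


# X splits each ground-truth segment in two: pure oversegmentation.
split, merge = vi_split_merge_sketch([1, 2, 3, 4], [1, 1, 2, 2])
# split is 1.0 bit, merge is 0.0
```

The sketch uses the chain rule for entropy rather than summing conditional probabilities directly, which keeps the code short at the cost of building the full joint table.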

Parameters:

reconstruction (np.ndarray, int type, arbitrary shape) – A candidate segmentation.
groundtruth (np.ndarray, int type, same shape as reconstruction) – The ground truth segmentation.
ignore_reconstruction (list of int, optional) – Any points having a label in this list are ignored in the evaluation. By default, no labels in the reconstruction are ignored.
ignore_groundtruth (list of int, optional) – Any points having a label in this list are ignored in the evaluation. By default, only the label 0 in the ground truth is ignored.

Returns:

(split, merge) – The variation of information split and merge errors, i.e., H(X|Y) and H(Y|X).

Return type:

tuple of (float, float)

Raises:

ValueError – If reconstruction and groundtruth have different shapes.
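One way to picture the ignore lists and the shape check is as a mask applied before any counting. The sketch below is an assumption about the behaviour described above, not the module's code: the helper name `drop_ignored_points` is hypothetical, and its defaults mirror the signature of `voi` (nothing ignored in the reconstruction, label 0 ignored in the ground truth).

```python
import numpy as np


def drop_ignored_points(reconstruction, groundtruth,
                        ignore_reconstruction=(), ignore_groundtruth=(0,)):
    """Return the label pairs that remain after removing ignored points.

    Hypothetical helper for illustration only.
    """
    reconstruction = np.asarray(reconstruction)
    groundtruth = np.asarray(groundtruth)
    if reconstruction.shape != groundtruth.shape:
        raise ValueError("reconstruction and groundtruth must have the same shape")

    # A point is dropped if either of its two labels is in an ignore list.
    keep = ~np.isin(groundtruth, list(ignore_groundtruth))
    keep &= ~np.isin(reconstruction, list(ignore_reconstruction))

    # Boolean indexing flattens the result to the surviving points.
    return reconstruction[keep], groundtruth[keep]


rec = np.array([[1, 1], [2, 2]])
gt = np.array([[0, 5], [5, 0]])
rec_kept, gt_kept = drop_ignored_points(rec, gt)
# rec_kept is [1, 2] and gt_kept is [5, 5]: the two points with ground-truth label 0 are gone
```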

References

[1] Meila, M. (2007). Comparing clusterings - an information based distance. Journal of Multivariate Analysis 98, 873-895.

dacapo.utils.voi.split_vi(x, y=None, ignore_x=[0], ignore_y=[0])

Return the symmetric conditional entropies associated with the VI.

The variation of information is defined as VI(X,Y) = H(X|Y) + H(Y|X). If Y is the ground-truth segmentation, then H(Y|X) can be interpreted as the amount of under-segmentation of Y, and H(X|Y) as the amount of over-segmentation. In other words, a perfect over-segmentation (every segment of X lies inside a single segment of Y) has H(Y|X) = 0, and a perfect under-segmentation (every segment of Y lies inside a single segment of X) has H(X|Y) = 0.

If y is None, x is assumed to be a contingency table.
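To make the contingency-table case concrete, here is a minimal sketch of how the two conditional entropies follow from a joint probability table. It is written under assumptions and is not this module's code: the name `conditional_entropies_from_table` is hypothetical, rows are assumed to index the labels of X and columns the labels of Y, and the result is ordered as H(Y|X), H(X|Y).

```python
import numpy as np


def conditional_entropies_from_table(pxy):
    """Return np.array([H(Y|X), H(X|Y)]) in bits from a joint probability table.

    Hypothetical helper for illustration only. Rows are assumed to index
    labels of X, columns labels of Y, and the table must sum to 1.0.
    """
    pxy = np.asarray(pxy, dtype=float)
    if not np.isclose(pxy.sum(), 1.0):
        raise ValueError("contingency table must sum to 1.0")

    def entropy(p):
        p = p[p > 0]  # treat 0 * log(0) as 0
        return float(-(p * np.log2(p)).sum())

    h_xy = entropy(pxy)             # joint entropy H(X, Y)
    h_x = entropy(pxy.sum(axis=1))  # row sums give p(x)
    h_y = entropy(pxy.sum(axis=0))  # column sums give p(y)
    return np.array([h_xy - h_x, h_xy - h_y])


# One segment in X covers two equally sized segments in Y: pure under-segmentation.
sv = conditional_entropies_from_table([[0.5, 0.5]])
# sv[0] = H(Y|X) is 1.0 bit, sv[1] = H(X|Y) is 0.0
```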

Parameters:

x (np.ndarray) – Label field (int type) or contingency table (float). x is interpreted as a contingency table (summing to 1.0) if and only if y is not provided.
y (np.ndarray of int, same shape as x, optional) – A label field to compare to x.
ignore_x (list of int, optional) – Any points having a label in this list are ignored in the evaluation. Ignore 0-labeled points by default.
ignore_y (list of int, optional) – Any points having a label in this list are ignored in the evaluation. Ignore 0-labeled points by default.

Returns:

sv – The conditional entropies of Y|X and X|Y.

Return type:

np.ndarray of float, shape (2,)