dacapo.utils.voi
Functions
voi | Return the conditional entropies of the variation of information metric. [1] |
split_vi | Return the symmetric conditional entropies associated with the VI. |
vi_tables | Return probability tables used for calculating VI. |
contingency_table | Return the contingency table for all regions in matched segmentations. |
divide_columns | Divide each column of matrix by the corresponding element in row. |
divide_rows | Divide each row of matrix by the corresponding element in column. |
xlogx | Compute x * log_2(x). |
Module Contents
- dacapo.utils.voi.voi(reconstruction, groundtruth, ignore_reconstruction=[], ignore_groundtruth=[0])
Return the conditional entropies of the variation of information metric. [1]
Let X be a reconstruction, and Y a ground truth labelling. The variation of information between the two is the sum of two conditional entropies:
VI(X, Y) = H(X|Y) + H(Y|X).
The first term, H(X|Y), measures oversegmentation; the second, H(Y|X), measures undersegmentation. They are referred to as the variation of information split error and merge error, respectively.
- Parameters:
reconstruction (np.ndarray, int type, arbitrary shape) – A candidate segmentation.
groundtruth (np.ndarray, int type, same shape as reconstruction) – The ground truth segmentation.
ignore_reconstruction (list of int, optional) – Any points having a label in this list in the reconstruction are ignored in the evaluation. By default, no reconstruction labels are ignored.
ignore_groundtruth (list of int, optional) – Any points having a label in this list in the ground truth are ignored in the evaluation. By default, only the label 0 in the ground truth is ignored.
- Returns:
(split, merge) – The variation of information split and merge errors, i.e., H(X|Y) and H(Y|X).
- Return type:
float
- Raises:
ValueError – If reconstruction and groundtruth have different shapes.
References
[1] Meila, M. (2007). Comparing clusterings - an information based distance. Journal of Multivariate Analysis 98, 873-895.
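The definition above can be checked by hand on a tiny example. The following is a minimal sketch in plain Python, not the module's implementation: the helper name `vi_split_merge` is hypothetical, it works on flat label sequences rather than arrays, and it does not handle ignored labels.

```python
from collections import Counter
from math import log2


def vi_split_merge(reconstruction, groundtruth):
    """Return (H(X|Y), H(Y|X)) in bits for two equal-length label sequences.

    Hypothetical helper for illustration only; no labels are ignored.
    """
    if len(reconstruction) != len(groundtruth):
        raise ValueError("reconstruction and groundtruth must have the same length")
    n = len(reconstruction)
    joint = Counter(zip(reconstruction, groundtruth))  # counts of (x, y) pairs
    count_x = Counter(reconstruction)                  # marginal counts of X
    count_y = Counter(groundtruth)                     # marginal counts of Y
    # H(X|Y) = -sum_{x,y} p(x, y) * log2(p(x, y) / p(y))
    h_x_given_y = -sum(c / n * log2(c / count_y[y]) for (x, y), c in joint.items())
    # H(Y|X) = -sum_{x,y} p(x, y) * log2(p(x, y) / p(x))
    h_y_given_x = -sum(c / n * log2(c / count_x[x]) for (x, y), c in joint.items())
    return h_x_given_y, h_y_given_x


# One ground-truth object (label 1) cut into two equal halves by the
# reconstruction: a pure split, so H(X|Y) is 1 bit and H(Y|X) is 0.
split, merge = vi_split_merge([1, 1, 2, 2], [1, 1, 1, 1])
```

Swapping the two arguments turns the same data into a pure merge, and the two values swap accordingly.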
- dacapo.utils.voi.split_vi(x, y=None, ignore_x=[0], ignore_y=[0])
Return the symmetric conditional entropies associated with the VI.
The variation of information is defined as VI(X,Y) = H(X|Y) + H(Y|X). If Y is the ground-truth segmentation, then H(Y|X) can be interpreted as the amount of under-segmentation of Y and H(X|Y) is then the amount of over-segmentation. In other words, a perfect over-segmentation will have H(Y|X)=0 and a perfect under-segmentation will have H(X|Y)=0.
If y is None, x is assumed to be a contingency table.
- Parameters:
x (np.ndarray) – Label field (int type) or contingency table (float). x is interpreted as a contingency table (summing to 1.0) if and only if y is not provided.
y (np.ndarray of int, same shape as x, optional) – A label field to compare to x.
ignore_x (list of int, optional) – Any points having a label in this list are ignored in the evaluation. Ignore 0-labeled points by default.
ignore_y (list of int, optional) – Any points having a label in this list are ignored in the evaluation. Ignore 0-labeled points by default.
- Returns:
sv – The conditional entropies of Y|X and X|Y.
- Return type:
np.ndarray of float, shape (2,)
See also
vi
- dacapo.utils.voi.vi_tables(x, y=None, ignore_x=[0], ignore_y=[0])
Return probability tables used for calculating VI.
If y is None, x is assumed to be a contingency table.
- Parameters:
x, y (np.ndarray) – Either x and y are provided as equal-shaped np.ndarray label fields (int type), or y is not provided and x is a contingency table (sparse.csc_matrix) that may or may not sum to 1.
ignore_x, ignore_y (list of int, optional) – Rows and columns (respectively) to ignore in the contingency table. These are labels that are not counted when evaluating VI.
- Returns:
pxy (sparse.csc_matrix of float) – The normalized contingency table.
px, py, hxgy, hygx, lpygx, lpxgy (np.ndarray of float) – The proportions of each label in x and y (px, py), the per-segment conditional entropies of x given y and vice-versa (hxgy, hygx), and the per-segment conditional probability p log p terms (lpygx, lpxgy).
- Raises:
ValueError – If x and y have different shapes.
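To make the returned quantities concrete, here is a minimal plain-Python sketch that derives them from a small, already normalized contingency table given as nested lists. The name `vi_tables_sketch` is hypothetical, and the sign and weighting of the per-segment terms are an assumption chosen so that `sum(hxgy)` equals H(X|Y) and `sum(hygx)` equals H(Y|X); the module itself may use a different convention.

```python
from math import log2


def xlog2x(p):
    """p * log2(p), with the convention 0 * log2(0) = 0."""
    return p * log2(p) if p > 0 else 0.0


def vi_tables_sketch(pxy):
    """Derive marginals and per-segment entropy terms from a normalized table.

    pxy[i][j] is the joint probability of label i in x and label j in y.
    Hypothetical helper; conventions are assumptions, see the lead-in.
    """
    n_rows, n_cols = len(pxy), len(pxy[0])
    px = [sum(row) for row in pxy]                                    # p(x_i)
    py = [sum(pxy[i][j] for i in range(n_rows)) for j in range(n_cols)]  # p(y_j)
    # For each y_j: sum over x of p(x|y_j) * log2 p(x|y_j)
    lpxgy = [
        sum(xlog2x(pxy[i][j] / py[j]) for i in range(n_rows)) if py[j] > 0 else 0.0
        for j in range(n_cols)
    ]
    # For each x_i: sum over y of p(y|x_i) * log2 p(y|x_i)
    lpygx = [
        sum(xlog2x(pxy[i][j] / px[i]) for j in range(n_cols)) if px[i] > 0 else 0.0
        for i in range(n_rows)
    ]
    hxgy = [-py[j] * lpxgy[j] for j in range(n_cols)]  # contribution of y_j to H(X|Y)
    hygx = [-px[i] * lpygx[i] for i in range(n_rows)]  # contribution of x_i to H(Y|X)
    return px, py, hxgy, hygx, lpygx, lpxgy


pxy = [[0.5, 0.0],
       [0.25, 0.25]]
px, py, hxgy, hygx, lpygx, lpxgy = vi_tables_sketch(pxy)
```

In this example, label 0 of x lies entirely inside one label of y and so contributes nothing to H(Y|X); all of that entropy comes from label 1 of x, which straddles both labels of y.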
- dacapo.utils.voi.contingency_table(seg, gt, ignore_seg=[0], ignore_gt=[0], norm=True)
Return the contingency table for all regions in matched segmentations.
- Parameters:
seg (np.ndarray, int type, arbitrary shape) – A candidate segmentation.
gt (np.ndarray, int type, same shape as seg) – The ground truth segmentation.
ignore_seg (list of int, optional) – Values to ignore in seg. Voxels in seg having a value in this list will not contribute to the contingency table. (default: [0])
ignore_gt (list of int, optional) – Values to ignore in gt. Voxels in gt having a value in this list will not contribute to the contingency table. (default: [0])
norm (bool, optional) – Whether to normalize the table so that it sums to 1.
- Returns:
cont – A contingency table. cont[i, j] will equal the number of voxels labeled i in seg and j in gt. (Or the proportion of such voxels if norm=True.)
- Return type:
scipy.sparse.csc_matrix
- Raises:
ValueError – If seg and gt have different shapes.
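The counting rule described above can be sketched without any array library. This is an illustrative stand-in, not the module's code: `contingency_sketch` is a hypothetical name, it takes flat sequences instead of arrays, and it returns a dictionary keyed by `(i, j)` rather than a sparse matrix.

```python
from collections import Counter


def contingency_sketch(seg, gt, ignore_seg=(0,), ignore_gt=(0,), norm=True):
    """Count co-occurring (seg label, gt label) pairs, skipping ignored labels.

    Hypothetical helper: returns {(i, j): count or proportion}.
    """
    if len(seg) != len(gt):
        raise ValueError("seg and gt must have the same length")
    counts = Counter(
        (s, g)
        for s, g in zip(seg, gt)
        if s not in ignore_seg and g not in ignore_gt
    )
    if not norm:
        return dict(counts)
    total = sum(counts.values())
    if total == 0:
        return {}  # every voxel was ignored
    return {pair: c / total for pair, c in counts.items()}


seg = [1, 1, 2, 2, 0, 2]
gt = [1, 1, 1, 2, 2, 0]
raw = contingency_sketch(seg, gt, norm=False)   # {(1, 1): 2, (2, 1): 1, (2, 2): 1}
normalized = contingency_sketch(seg, gt)        # proportions that sum to 1
```

The last two voxels are dropped because one of their labels is 0 in seg or gt, so only four voxels contribute to the table.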
- dacapo.utils.voi.divide_columns(matrix, row, in_place=False)
Divide each column of matrix by the corresponding element in row.
The result is as follows: out[i, j] = matrix[i, j] / row[j]
- Parameters:
matrix (np.ndarray, scipy.sparse.csc_matrix or csr_matrix, shape (M, N)) – The input matrix.
row (a 1D np.ndarray, shape (N,)) – The row vector by which each column of matrix is divided.
in_place (bool (optional, default False)) – Do the computation in-place.
- Returns:
out – The result of the column-wise division.
- Return type:
same type as matrix
- Raises:
ValueError – If row contains zeros.
- dacapo.utils.voi.divide_rows(matrix, column, in_place=False)
Divide each row of matrix by the corresponding element in column.
The result is as follows: out[i, j] = matrix[i, j] / column[i]
- Parameters:
matrix (np.ndarray, scipy.sparse.csc_matrix or csr_matrix, shape (M, N)) – The input matrix.
column (a 1D np.ndarray, shape (M,)) – The column vector by which each row of matrix is divided.
in_place (bool (optional, default False)) – Do the computation in-place.
- Returns:
out – The result of the row-wise division.
- Return type:
same type as matrix
- Raises:
ValueError – If column contains zeros.
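The index formulas for divide_columns and divide_rows are easy to mix up, so the following sketch spells both out on nested lists. The `_sketch` function names are hypothetical and the code only mirrors the documented semantics (including the ValueError on zeros); it does not cover sparse matrices or the in_place option.

```python
def divide_columns_sketch(matrix, row):
    """out[i][j] = matrix[i][j] / row[j] (hypothetical helper)."""
    if any(value == 0 for value in row):
        raise ValueError("row contains zeros")
    return [[value / row[j] for j, value in enumerate(line)] for line in matrix]


def divide_rows_sketch(matrix, column):
    """out[i][j] = matrix[i][j] / column[i] (hypothetical helper)."""
    if any(value == 0 for value in column):
        raise ValueError("column contains zeros")
    return [[value / column[i] for value in line] for i, line in enumerate(matrix)]


matrix = [[2.0, 4.0],
          [6.0, 8.0]]
by_columns = divide_columns_sketch(matrix, [2.0, 4.0])  # [[1.0, 1.0], [3.0, 2.0]]
by_rows = divide_rows_sketch(matrix, [2.0, 4.0])        # [[1.0, 2.0], [1.5, 2.0]]
```

Dividing the columns of a contingency table by the column sums yields p(x|y), and dividing the rows by the row sums yields p(y|x), which is how these helpers relate to the conditional entropies.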
- dacapo.utils.voi.xlogx(x, out=None, in_place=False)
Compute x * log_2(x).
We define 0 * log_2(0) = 0.
- Parameters:
x (np.ndarray or scipy.sparse.csc_matrix or csr_matrix) – The input array.
out (same type as x (optional)) – If provided, use this array/matrix for the result.
in_place (bool (optional, default False)) – Operate directly on x.
- Returns:
y – Result of x * log_2(x).
- Return type:
same type as x
- Raises:
ValueError – If x contains negative values.
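The zero convention matters because log_2(0) is undefined. A minimal element-wise sketch over a plain list, with the hypothetical name `xlogx_sketch` and without the out and in_place options, could look like this:

```python
from math import log2


def xlogx_sketch(values):
    """Return [v * log2(v)] element-wise, with 0 * log2(0) defined as 0.

    Hypothetical helper operating on a flat list of non-negative floats.
    """
    if any(v < 0 for v in values):
        raise ValueError("input contains negative values")
    return [v * log2(v) if v > 0 else 0.0 for v in values]


result = xlogx_sketch([0.0, 0.5, 1.0, 2.0])  # [0.0, -0.5, 0.0, 2.0]
```

For probabilities in (0, 1) the result is negative, which is why entropy formulas negate the sum of these terms.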