skbio.stats.distance.DissimilarityMatrix
- class skbio.stats.distance.DissimilarityMatrix(data, ids=None, validate=True)
Store dissimilarities between objects.
A DissimilarityMatrix instance stores a square, hollow, two-dimensional matrix of dissimilarities between objects. Objects could be, for example, samples or DNA sequences. A sequence of IDs accompanies the dissimilarities.
Methods are provided to load and save dissimilarity matrices from/to disk, as well as perform common operations such as extracting dissimilarities based on object ID.
- Parameters:
data (array_like or DissimilarityMatrix) – Square, hollow, two-dimensional numpy.ndarray of dissimilarities (floats), or a structure that can be converted to a numpy.ndarray using numpy.asarray, or a one-dimensional vector of dissimilarities (floats), as defined by scipy.spatial.distance.squareform. Can instead be a DissimilarityMatrix (or subclass) instance, in which case the instance's data will be used. Data will be converted to a float dtype if necessary. A copy will not be made if already a numpy.ndarray with a float dtype.
ids (sequence of str, optional) – Sequence of strings to be used as object IDs. Must match the number of rows/cols in data. If None (the default), IDs will be monotonically-increasing integers cast as strings, with numbering starting from zero, e.g., ('0', '1', '2', '3', ...).
validate (bool, optional) – If validate is True (the default) and data is not a DissimilarityMatrix object, the input data will be validated.
See also
DistanceMatrix, scipy.spatial.distance.squareform
Notes
The dissimilarities are stored in redundant (square-form) format [1].
The data are not checked for symmetry, nor guaranteed/assumed to be symmetric.
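To make the terms "redundant format", "hollow", and "symmetric" concrete, here is a small pure-Python sketch. The helper names (vector_to_redundant, is_hollow, is_symmetric) are hypothetical and are not part of scikit-bio; the vector ordering is assumed to follow the row-major upper-triangle convention used by scipy.spatial.distance.squareform.

```python
def vector_to_redundant(vector, n):
    """Expand a condensed vector of n*(n-1)/2 values into an n x n matrix.

    Hypothetical helper for illustration only; not the library's code.
    """
    if len(vector) != n * (n - 1) // 2:
        raise ValueError("vector length must equal n*(n-1)/2")
    matrix = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Each value is stored twice, hence "redundant" form.
            matrix[i][j] = matrix[j][i] = float(vector[k])
            k += 1
    return matrix


def is_hollow(matrix):
    """True if every diagonal entry is zero."""
    return all(matrix[i][i] == 0 for i in range(len(matrix)))


def is_symmetric(matrix):
    """True if matrix[i][j] == matrix[j][i] for every pair of indices."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i]
               for i in range(n) for j in range(n))


redundant = vector_to_redundant([1.0, 2.0, 3.0], 3)

# Hollow but not symmetric: acceptable as a dissimilarity matrix per the
# notes above, since symmetry is neither checked nor assumed.
asymmetric = [[0.0, 1.0],
              [5.0, 0.0]]

print(redundant)
print(is_hollow(asymmetric), is_symmetric(asymmetric))
```

A matrix expanded from a vector is symmetric by construction; an explicitly supplied square matrix need not be.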
References
Attributes
T – Transpose of the dissimilarity matrix.
data – Array of dissimilarities.
default_write_format
dtype – Data type of the dissimilarities.
ids – Tuple of object IDs.
png – Display heatmap in IPython Notebook as PNG.
shape – Two-element tuple containing the dissimilarity matrix dimensions.
size – Total number of elements in the dissimilarity matrix.
svg – Display heatmap in IPython Notebook as SVG.
Built-ins
__contains__(lookup_id) – Check if the specified ID is in the dissimilarity matrix.
__eq__(other) – Compare this dissimilarity matrix to another for equality.
__ge__(value, /) – Return self>=value.
__getitem__(index) – Slice into dissimilarity data by object ID or numpy indexing.
Helper for pickle.
__gt__(value, /) – Return self>value.
__le__(value, /) – Return self<=value.
__lt__(value, /) – Return self<value.
__ne__(other) – Determine whether two dissimilarity matrices are not equal.
__str__() – Return a string representation of the dissimilarity matrix.
Methods
between(from_, to_[, allow_overlap]) – Obtain the distances between the two groups of IDs.
copy() – Return a deep copy of the dissimilarity matrix.
filter(ids[, strict]) – Filter the dissimilarity matrix by IDs.
from_iterable(iterable, metric[, key, keys]) – Create DissimilarityMatrix from an iterable given a metric.
index(lookup_id) – Return the index of the specified ID.
plot([cmap, title]) – Creates a heatmap of the dissimilarity matrix.
read(file[, format]) – Create a new DissimilarityMatrix instance from a file.
Return an array of dissimilarities in redundant format.
Create a pandas.DataFrame from this DissimilarityMatrix.
Return the transpose of the dissimilarity matrix.
within(ids) – Obtain all the distances among the set of IDs.
write(file[, format]) – Write an instance of DissimilarityMatrix to a file.