This module provides methods for clustering, histograms and other statistical computations.
In particular, it provides code to compute clusterings. Adaptors allow easy clustering of points and of model configurations stored in IMP::ConfigurationSet objects, among other things.
Author(s): Keren Lasker, Daniel Russel
Version: SVN 8931
License: LGPL. This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
Data Structures | |
| class | ConfigurationSetRMSDMetric |
| class | ConfigurationSetXYZEmbedding |
| Embed a configuration using the XYZ coordinates of a set of particles. | |
| class | Embedding |
| Map clustering data to spatial positions. | |
| class | Histogram |
| Histogram. | |
| class | Metric |
| Compute a distance between two elements to be clustered. | |
| class | ParticleEmbedding |
| class | PartitionalClustering |
| The base class for clusterings of data sets. | |
| class | PartitionalClusteringWithCenter |
| class | VectorDEmbedding |
| Simply return the coordinates of a VectorD. | |
Functions | |
| PartitionalClusteringWithCenter * | create_bin_based_clustering (Embedding *embed, double side) |
| PartitionalClustering * | create_centrality_clustering (Metric *d, double far, int k) |
| PartitionalClustering * | create_centrality_clustering (Embedding *d, double far, int k) |
| PartitionalClusteringWithCenter * | create_connectivity_clustering (Embedding *embed, double dist) |
| PartitionalClusteringWithCenter * | create_lloyds_kmeans (Embedding *embedding, unsigned int k, unsigned int iterations) |
| algebra::VectorKDs | get_centroids (Embedding *d, PartitionalClustering *pc) |
| std::string | get_data_path (std::string file_name) |
| Return the path to installed data for this module. | |
| std::string | get_example_path (std::string file_name) |
| Return the path to installed example data for this module. | |
| std::string | get_module_name () |
| const VersionInfo & | get_module_version_info () |
| Ints | get_representatives (Embedding *d, PartitionalClustering *pc) |
PartitionalClusteringWithCenter* IMP::statistics::create_bin_based_clustering(Embedding *embed, double side)
The space is gridded with bins of side length side, and all points that fall in the same grid bin are placed in the same cluster.
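The binning idea can be illustrated without IMP. The following is a minimal sketch, not the library's implementation: `Point` and `bin_based_clusters` are hypothetical names, the embedding is replaced by a plain vector of coordinates, and the grid is assumed to be anchored at the origin.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for an embedding: one coordinate vector per item.
using Point = std::vector<double>;

// Group point indices by the grid bin (cube of edge length `side`) they fall in.
// Assumption: the grid is anchored at the origin.
// Returns one vector of point indices per non-empty bin.
std::vector<std::vector<int>> bin_based_clusters(const std::vector<Point>& points,
                                                 double side) {
  assert(side > 0.0);
  std::map<std::vector<long>, std::vector<int>> bins;
  for (std::size_t i = 0; i < points.size(); ++i) {
    // The bin key is the integer grid index along every dimension.
    std::vector<long> key;
    key.reserve(points[i].size());
    for (double c : points[i]) {
      key.push_back(static_cast<long>(std::floor(c / side)));
    }
    bins[key].push_back(static_cast<int>(i));
  }
  std::vector<std::vector<int>> clusters;
  for (const auto& entry : bins) {
    clusters.push_back(entry.second);
  }
  return clusters;
}
```

Using `std::floor` rather than integer truncation keeps negative coordinates in their own bins instead of merging them with the bin that starts at zero.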
PartitionalClustering* IMP::statistics::create_centrality_clustering(Metric *d, double far, int k)
Cluster by repeatedly removing the edges that have the most shortest paths passing through them. The process terminates once a set number of connected components is reached. Other termination criteria can be added if someone proposes them.
PartitionalClustering* IMP::statistics::create_centrality_clustering(Embedding *d, double far, int k)
Cluster by repeatedly removing the edges that have the most shortest paths passing through them. The process terminates once a set number of connected components is reached. Other termination criteria can be added if someone proposes them.
PartitionalClusteringWithCenter* IMP::statistics::create_connectivity_clustering(Embedding *embed, double dist)
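Two points, $p_i$ and $p_j$, are in the same cluster if there is a sequence of points $\left(p^{ij}_0, \dots, p^{ij}_k\right)$ with $p^{ij}_0 = p_i$ and $p^{ij}_k = p_j$ such that $\forall l \; \|p^{ij}_l - p^{ij}_{l+1}\| < \mathrm{dist}$.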
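Connectivity clustering of this kind amounts to finding connected components of the graph that links every pair of points closer than `dist`. The sketch below uses a union-find structure over all pairs. It is an illustration under assumptions, not the IMP code: `Point`, `euclidean_distance`, `find_root` and `connectivity_labels` are hypothetical names, and the strict comparison `< dist` is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for an embedding: one coordinate vector per item.
using Point = std::vector<double>;

// Euclidean distance between two points of equal dimension.
double euclidean_distance(const Point& a, const Point& b) {
  assert(a.size() == b.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    sum += (a[i] - b[i]) * (a[i] - b[i]);
  }
  return std::sqrt(sum);
}

// Find the root of x's set, shortening the path along the way.
int find_root(std::vector<int>& parent, int x) {
  while (parent[x] != x) {
    parent[x] = parent[parent[x]];
    x = parent[x];
  }
  return x;
}

// Return one label per point. Two points share a label exactly when a chain
// of points connects them with every consecutive hop shorter than `dist`.
std::vector<int> connectivity_labels(const std::vector<Point>& points,
                                     double dist) {
  const int n = static_cast<int>(points.size());
  std::vector<int> parent(n);
  std::iota(parent.begin(), parent.end(), 0);
  for (int i = 0; i < n; ++i) {
    for (int j = i + 1; j < n; ++j) {
      if (euclidean_distance(points[i], points[j]) < dist) {
        const int root_i = find_root(parent, i);
        const int root_j = find_root(parent, j);
        parent[root_i] = root_j;  // merge the two components
      }
    }
  }
  std::vector<int> labels(n);
  for (int i = 0; i < n; ++i) {
    labels[i] = find_root(parent, i);
  }
  return labels;
}
```

The all-pairs loop is quadratic in the number of points; a spatial index would be the usual way to scale this up.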
PartitionalClusteringWithCenter* IMP::statistics::create_lloyds_kmeans(Embedding *embedding, unsigned int k, unsigned int iterations)
Return a k-means clustering of all points contained in the embedding (i.e., the indices [0, embedding->get_number_of_embeddings())). These points are clustered into k clusters. More iterations take longer but generally produce a better clustering.
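For readers unfamiliar with Lloyd's algorithm, the loop alternates between assigning each point to its nearest center and moving each center to the mean of its assigned points. The following is a minimal standalone sketch, not the IMP implementation: `Point`, `squared_distance` and `lloyds_kmeans` are hypothetical names, and seeding the centers with the first k points is an assumption made to keep the example deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for an embedding: one coordinate vector per item.
using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
double squared_distance(const Point& a, const Point& b) {
  assert(a.size() == b.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    sum += (a[i] - b[i]) * (a[i] - b[i]);
  }
  return sum;
}

// Return, for every point, the index of the cluster it ends up in.
// Assumption: the first k points serve as the initial centers.
std::vector<int> lloyds_kmeans(const std::vector<Point>& points,
                               unsigned int k, unsigned int iterations) {
  assert(k > 0 && k <= points.size());
  const std::size_t dim = points[0].size();
  std::vector<Point> centers(points.begin(), points.begin() + k);
  std::vector<int> assignment(points.size(), 0);
  for (unsigned int it = 0; it < iterations; ++it) {
    // Assignment step: attach each point to its nearest center.
    for (std::size_t i = 0; i < points.size(); ++i) {
      double best = std::numeric_limits<double>::max();
      for (unsigned int c = 0; c < k; ++c) {
        const double d = squared_distance(points[i], centers[c]);
        if (d < best) {
          best = d;
          assignment[i] = static_cast<int>(c);
        }
      }
    }
    // Update step: move each center to the mean of its assigned points.
    std::vector<Point> sums(k, Point(dim, 0.0));
    std::vector<std::size_t> counts(k, 0);
    for (std::size_t i = 0; i < points.size(); ++i) {
      for (std::size_t d = 0; d < dim; ++d) {
        sums[assignment[i]][d] += points[i][d];
      }
      ++counts[assignment[i]];
    }
    for (unsigned int c = 0; c < k; ++c) {
      if (counts[c] == 0) continue;  // an empty cluster keeps its old center
      for (std::size_t d = 0; d < dim; ++d) {
        centers[c][d] = sums[c][d] / static_cast<double>(counts[c]);
      }
    }
  }
  return assignment;
}
```

Each iteration costs time proportional to the number of points times k, which is why a larger iteration count takes longer to run.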
algebra::VectorKDs IMP::statistics::get_centroids(Embedding *d, PartitionalClustering *pc)
Given a clustering and an embedding, compute the centroid for each cluster.
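As an illustration of the computation, the centroid of a cluster is the coordinate-wise mean of its members. The sketch below is standalone and makes assumptions: `Point` and `cluster_centroids` are hypothetical names, and a clustering is represented simply as a list of index lists rather than a PartitionalClustering object.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an embedding: one coordinate vector per item.
using Point = std::vector<double>;

// Compute the coordinate-wise mean of the points in each cluster.
// `clusters[c]` lists the indices of the points belonging to cluster c.
std::vector<Point> cluster_centroids(
    const std::vector<Point>& points,
    const std::vector<std::vector<int>>& clusters) {
  std::vector<Point> centroids;
  for (const auto& members : clusters) {
    assert(!members.empty());
    const std::size_t dim = points[members[0]].size();
    Point centroid(dim, 0.0);
    for (int idx : members) {
      for (std::size_t d = 0; d < dim; ++d) {
        centroid[d] += points[idx][d];
      }
    }
    for (double& c : centroid) {
      c /= static_cast<double>(members.size());
    }
    centroids.push_back(centroid);
  }
  return centroids;
}
```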
std::string IMP::statistics::get_data_path(std::string file_name)
Return the path to installed data for this module.
Each module has its own data directory, so be sure to use the version of this function in the correct module. To read the data file "data_library" that was placed in the data directory of module "mymodule", do something like
std::ifstream in(IMP::mymodule::get_data_path("data_library"));
This will ensure that the code works when IMP is installed or used via the tools/imppy.sh script.
std::string IMP::statistics::get_example_path(std::string file_name)
Return the path to installed example data for this module.
Each module has its own example directory, so be sure to use the version of this function in the correct module. For example, to read the file example_protein.pdb located in the examples directory of the IMP::atom module, do
IMP::atom::read_pdb(IMP::atom::get_example_path("example_protein.pdb"), model);
This will ensure that the code works when IMP is installed or used via the tools/imppy.sh script.
Ints IMP::statistics::get_representatives(Embedding *d, PartitionalClustering *pc)
Given a clustering and an embedding, compute a representative element for each cluster.
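The text does not say how the representative is chosen. One common choice, assumed here purely for illustration, is the cluster member closest to the cluster centroid. In this standalone sketch `Point`, `squared_distance` and `cluster_representatives` are hypothetical names, and a clustering is represented as a list of index lists.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an embedding: one coordinate vector per item.
using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
double squared_distance(const Point& a, const Point& b) {
  assert(a.size() == b.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    sum += (a[i] - b[i]) * (a[i] - b[i]);
  }
  return sum;
}

// For each cluster, return the index of the member nearest to the centroid.
// Assumption: "representative" means the member closest to the cluster mean.
std::vector<int> cluster_representatives(
    const std::vector<Point>& points,
    const std::vector<std::vector<int>>& clusters) {
  std::vector<int> representatives;
  for (const auto& members : clusters) {
    assert(!members.empty());
    const std::size_t dim = points[members[0]].size();
    // Centroid: coordinate-wise mean of the members.
    Point centroid(dim, 0.0);
    for (int idx : members) {
      for (std::size_t d = 0; d < dim; ++d) {
        centroid[d] += points[idx][d];
      }
    }
    for (double& c : centroid) {
      c /= static_cast<double>(members.size());
    }
    // Pick the member with the smallest distance to the centroid.
    int best = members[0];
    double best_d2 = squared_distance(points[best], centroid);
    for (int idx : members) {
      const double d2 = squared_distance(points[idx], centroid);
      if (d2 < best_d2) {
        best_d2 = d2;
        best = idx;
      }
    }
    representatives.push_back(best);
  }
  return representatives;
}
```

Returning an index rather than a coordinate means the representative is always an actual element of the data set, which matches the Ints return type of the documented function.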