This module provides RMF I/O.
The library provides support for the RMF file format for storing hierarchical molecular data (such as atomic or coarse-grained representations of proteins), along with markup, including geometry and score data.
The library uses the HDF5 library to manage the data on disk. Other backends (e.g. mmCIF) could be used, if desired.
See RMF file format for more information about the files and RMF data categories for standard data.
The RMF library provides an intermediate level interface to facilitate I/O of RMF data into and out of programs. The primary classes of interest are RMF::RootHandle representing the root of an RMF hierarchy and RMF::NodeHandle representing a node in the hierarchy.
The file is automatically closed when the last handle to it is destroyed.
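The closing rule above can be pictured with a small reference-counting model. This is only a toy sketch, not the RMF implementation: `SharedFile`, `Handle` and `release()` are hypothetical names, and the explicit `release()` call stands in for the destruction of a handle.

```python
class SharedFile:
    """Toy stand-in for an open file shared by several handles (not the RMF API)."""

    def __init__(self, path):
        self.path = path
        self.handle_count = 0
        self.is_open = True

    def close(self):
        self.is_open = False


class Handle:
    """Toy handle: registers itself on creation, unregisters on release."""

    def __init__(self, shared):
        self._shared = shared
        shared.handle_count += 1

    def release(self):
        # Explicit release stands in for the handle being destroyed.
        if self._shared is None:
            return  # already released
        shared, self._shared = self._shared, None
        shared.handle_count -= 1
        if shared.handle_count == 0:
            shared.close()  # the last handle is gone, so the file is closed


shared = SharedFile("example.rmf")
root = Handle(shared)
node = Handle(shared)

root.release()
open_after_first_release = shared.is_open  # one handle is still alive

node.release()
open_after_last_release = shared.is_open  # no handles remain
```

In this model the file stays open as long as any handle, root or node, is still alive.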
The library defines many classes, some of which are implemented using C++ templates. Every class supports output to a std::ostream in C++ and conversion to str in Python. In addition, every class can be compared to other instances of the same class and can be inserted in hash tables both in C++ and Python. The methods necessary to support these things are omitted for brevity.
Template classes are, in general, parameterized by one of two things: the number of nodes involved (the arity) or the type of data stored. For classes parameterized by the number of nodes, typedefs (and Python types) are provided for 1 through 4 nodes, with names like Category, PairCategory, TripletCategory, QuadCategory. For classes parameterized by type, typedefs and Python classes are provided for all the standard types, with names like IntKey, FloatKey etc. Some classes, like Key, are parameterized by both, so the full cross product of types is provided, with names like PairFloatKey.
In addition, there is a typedef for each type for managing lists of the objects. For example, a list of Category objects is passed using a Categories type. It looks like a std::vector in C++ and is a list in Python.
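The naming convention can be summarized with a few helper functions. This is a sketch for illustration only: the helpers are hypothetical, and the pluralization rule is inferred from the names that appear on this page (Categories, Factories, Copys, StaticAliass), not taken from the library source.

```python
# Prefixes for the number of nodes (arity); arity 1 has no prefix.
ARITY_PREFIX = {1: "", 2: "Pair", 3: "Triplet", 4: "Quad"}


def category_type_name(arity):
    """Name of the category type for a given arity, e.g. PairCategory."""
    return ARITY_PREFIX[arity] + "Category"


def key_type_name(type_name, arity=1):
    """Name of the key type for a data type and arity, e.g. PairFloatKey."""
    return ARITY_PREFIX[arity] + type_name + "Key"


def list_type_name(name):
    """Name of the list typedef for a type.

    Assumption: names ending in "ry" become "ries", everything else
    simply gets an "s" appended, matching the typedefs listed on this page.
    """
    if name.endswith("ry"):
        return name[:-1] + "ies"
    return name + "s"


names = [key_type_name("Float", arity) for arity in (1, 2, 3, 4)]
```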
The RMF wrapper has the concept of an association between nodes in its hierarchy and objects in the program accessing it. The methods RMF::RootHandle::get_node_handle_from_association(), RMF::NodeHandle::set_association() and RMF::NodeHandle::get_association() can be used to take advantage of this. The idea is that one can store pointers to the programmatic data structures corresponding to the nodes and so avoid maintaining one's own lookup table.
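The association idea can be illustrated with a toy model. The classes below are hypothetical stand-ins and do not use the RMF API; they only show how a two-way mapping between nodes and program objects removes the need for a separate lookup table.

```python
class ToyRoot:
    """Toy root: remembers which node is associated with which object."""

    def __init__(self):
        self._nodes_by_object = {}  # id(object) -> node

    def add_node(self, name):
        return ToyNode(self, name)

    def get_node_from_association(self, obj):
        return self._nodes_by_object[id(obj)]


class ToyNode:
    """Toy node: stores a reference to the associated program object."""

    def __init__(self, root, name):
        self._root = root
        self.name = name
        self._association = None

    def set_association(self, obj):
        # The node keeps obj alive, so id(obj) stays valid as a key.
        self._association = obj
        self._root._nodes_by_object[id(obj)] = self

    def get_association(self):
        return self._association


class Atom:
    """Hypothetical program-side data structure."""

    def __init__(self, name):
        self.name = name


root = ToyRoot()
node = root.add_node("CA")
atom = Atom("CA")
node.set_association(atom)

found_node = root.get_node_from_association(atom)
found_atom = node.get_association()
```

Both directions are available: from the program object to its node, and from the node back to the object.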
Traversing large RMF files can be slow, especially when doing so in scripting languages. The library provides a few batch methods, such as RMF::get_values(), to accelerate this process when loading values for each of a number of frames. Other such batch methods can be added as appropriate.
If RMF::RootHandle::flush() has been called since the last change, it is safe to read the file from another process. Writing from more than one process is not supported. Nor is reading or writing from more than one thread of the same program.
Currently, there is little explicit checking of invariants between attributes in the RMF file. An extensible framework for checking invariants on file close and open will be added.
The RMF library currently supports C++ and Python. The API is written so that SWIG can be used to easily generate bindings for most languages. The two main exceptions are C and Fortran. Until the SWIG C target support is finished, these can be supported by writing a simple C API manually, probably a week's work.
The library provides a simple wrapper for the HDF5 library through the HDF5Group, HDF5DataSetD, HDF5File and helper classes.
rmf_show prints out the hierarchy written to the file.
rmf_pdb converts an RMF file to or from a PDB file, assuming all hierarchies in the RMF file are atomic resolution.
rmf_xml converts an RMF file to an XML file that can be opened in an XML viewer (e.g. Google Chrome or Firefox). These viewers support collapsing of subtrees, which makes it much easier to get around large hierarchies.
See the RMF tools application for IMP-related helper programs.
Examples:
Author(s): Daniel Russel
Version: SVN.r14091
License: LGPL. This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
Publications:
IMP and how to apply them to biological problems.
Helper functions | |
These functions help make working with keys and categories easier, by allowing you to assume that the key/category is always there. | |
| template<int D> | |
| CategoryD< D > | get_category_always (FileHandle fh, std::string name) |
| CategoryD< 1 > | get_singleton_category_always (FileHandle rh, std::string name) |
| CategoryD< 2 > | get_pair_category_always (FileHandle rh, std::string name) |
| CategoryD< 3 > | get_triplet_category_always (FileHandle rh, std::string name) |
| CategoryD< 4 > | get_quad_category_always (FileHandle rh, std::string name) |
| template<class TypeT , int D> | |
| Key< TypeT, D > | get_key_always (FileHandle fh, CategoryD< D > cat, std::string name, bool per_frame=false) |
| Key< TypeTraits, 1 > | get_type_key_always (FileHandle rh, CategoryD< 1 > cat, std::string name, bool per_frame=false) |
| Key< TypeTraits, 2 > | get_type_key_always (FileHandle rh, CategoryD< 2 > cat, std::string name, bool per_frame=false) |
| Key< TypeTraits, 3 > | get_type_key_always (FileHandle rh, CategoryD< 3 > cat, std::string name, bool per_frame=false) |
| Key< TypeTraits, 4 > | get_type_key_always (FileHandle rh, CategoryD< 4 > cat, std::string name, bool per_frame=false) |
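The "always" helpers follow a get-or-create pattern. The sketch below shows that pattern on a toy file object; `ToyFile` and its methods are hypothetical and are not the RMF FileHandle interface.

```python
class ToyFile:
    """Toy stand-in for a file handle that stores named categories."""

    def __init__(self):
        self._categories = {}  # name -> integer index

    def has_category(self, name):
        return name in self._categories

    def add_category(self, name):
        self._categories[name] = len(self._categories)
        return self._categories[name]

    def get_category(self, name):
        return self._categories[name]  # raises KeyError if missing

    def get_number_of_categories(self):
        return len(self._categories)


def get_category_always(fh, name):
    """Return the category called name, creating it first if needed."""
    if not fh.has_category(name):
        fh.add_category(name)
    return fh.get_category(name)


fh = ToyFile()
first = get_category_always(fh, "physics")   # created on first use
second = get_category_always(fh, "physics")  # found on second use
```

The caller never has to check whether the category exists before asking for it.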
Data set names | |
The RMF format stores various pieces of data in data sets and attributes attached to the HDF5 group which is acting as the root. These functions return the names for the various data sets. | |
| String | get_node_data_data_set_name () |
| Get the name of the data set storing the data about each node. | |
| String | get_node_name_data_set_name () |
| Get the name of the data set storing the name for each node. | |
| String | get_bond_data_data_set_name () |
| Get the name of the data set for storing bonds. | |
| String | get_set_data_data_set_name (int arity) |
| Get the name of the data set for storing sets of the given arity. | |
| String | get_category_name_data_set_name (int arity) |
| Get the name of the data set for storing category names. | |
| String | get_key_list_data_set_name (std::string category_name, int Arity, String type_name, bool per_frame) |
| Get the name of the attribute which lists all the keys of the category. | |
| String | get_data_data_set_name (std::string category_name, int arity, String type_name, bool per_frame) |
| Get the name of the data set for storing a particular type of data. | |
Typedefs | |
| typedef vector< AtomConstFactory > | AtomConstFactories |
| typedef vector< AtomConst > | AtomConsts |
| typedef vector< AtomFactory > | AtomFactories |
| typedef vector< Atom > | Atoms |
| typedef vector< BallConstFactory > | BallConstFactories |
| typedef vector< BallConst > | BallConsts |
| typedef vector< BallFactory > | BallFactories |
| typedef vector< Ball > | Balls |
| typedef vector< ChainConstFactory > | ChainConstFactories |
| typedef vector< ChainConst > | ChainConsts |
| typedef vector< ChainFactory > | ChainFactories |
| typedef vector< Chain > | Chains |
| typedef vector< ColoredConstFactory > | ColoredConstFactories |
| typedef vector< ColoredConst > | ColoredConsts |
| typedef vector< ColoredFactory > | ColoredFactories |
| typedef vector< Colored > | Coloreds |
| typedef vector< CopyConstFactory > | CopyConstFactories |
| typedef vector< CopyConst > | CopyConsts |
| typedef vector< CopyFactory > | CopyFactories |
| typedef vector< Copy > | Copys |
| typedef vector< CylinderConstFactory > | CylinderConstFactories |
| typedef vector< CylinderConst > | CylinderConsts |
| typedef vector< CylinderFactory > | CylinderFactories |
| typedef vector< Cylinder > | Cylinders |
| typedef vector< DiffuserConstFactory > | DiffuserConstFactories |
| typedef vector< DiffuserConst > | DiffuserConsts |
| typedef vector< DiffuserFactory > | DiffuserFactories |
| typedef vector< Diffuser > | Diffusers |
| typedef vector< DomainConstFactory > | DomainConstFactories |
| typedef vector< DomainConst > | DomainConsts |
| typedef vector< DomainFactory > | DomainFactories |
| typedef vector< Domain > | Domains |
| typedef vector< FileConstHandle > | FileConstHandles |
| typedef vector< FileHandle > | FileHandles |
| typedef herr_t(* | HDF5CloseFunction )(hid_t) |
| The signature for the HDF5 close functions. | |
| typedef HDF5ConstAttributes< HDF5Object > | HDF5ConstDataSetAttributes |
| typedef vector< HDF5File > | HDF5ConstFiles |
| typedef HDF5ConstAttributes< HDF5Object > | HDF5ConstGroupAttributes |
| typedef vector< HDF5Group > | HDF5ConstGroups |
| typedef vector< HDF5File > | HDF5Files |
| typedef HDF5MutableAttributes< HDF5ConstGroup > | HDF5GroupAttributes |
| typedef vector< HDF5Group > | HDF5Groups |
| typedef vector< IntermediateParticleConstFactory > | IntermediateParticleConstFactories |
| typedef vector< IntermediateParticleConst > | IntermediateParticleConsts |
| typedef vector< IntermediateParticleFactory > | IntermediateParticleFactories |
| typedef vector< IntermediateParticle > | IntermediateParticles |
| typedef vector< JournalArticleConstFactory > | JournalArticleConstFactories |
| typedef vector< JournalArticleConst > | JournalArticleConsts |
| typedef vector< JournalArticleFactory > | JournalArticleFactories |
| typedef vector< JournalArticle > | JournalArticles |
| typedef vector< NodeConstHandle > | NodeConstHandles |
| typedef vector< NodeHandle > | NodeHandles |
| typedef vector< ParticleConstFactory > | ParticleConstFactories |
| typedef vector< ParticleConst > | ParticleConsts |
| typedef vector< ParticleFactory > | ParticleFactories |
| typedef vector< Particle > | Particles |
| typedef vector< ResidueConstFactory > | ResidueConstFactories |
| typedef vector< ResidueConst > | ResidueConsts |
| typedef vector< ResidueFactory > | ResidueFactories |
| typedef vector< Residue > | Residues |
| typedef vector< RigidParticleConstFactory > | RigidParticleConstFactories |
| typedef vector< RigidParticleConst > | RigidParticleConsts |
| typedef vector< RigidParticleFactory > | RigidParticleFactories |
| typedef vector< RigidParticle > | RigidParticles |
| typedef vector< ScoreConstFactory > | ScoreConstFactories |
| typedef vector< ScoreConst > | ScoreConsts |
| typedef vector< ScoreFactory > | ScoreFactories |
| typedef vector< Score > | Scores |
| typedef vector< SegmentConstFactory > | SegmentConstFactories |
| typedef vector< SegmentConst > | SegmentConsts |
| typedef vector< SegmentFactory > | SegmentFactories |
| typedef vector< Segment > | Segments |
| typedef vector< StaticAliasConstFactory > | StaticAliasConstFactories |
| typedef vector< StaticAliasConst > | StaticAliasConsts |
| typedef vector< StaticAliasFactory > | StaticAliasFactories |
| typedef vector< StaticAlias > | StaticAliass |
| typedef vector< TypedConstFactory > | TypedConstFactories |
| typedef vector< TypedConst > | TypedConsts |
| typedef vector< TypedFactory > | TypedFactories |
| typedef vector< Typed > | Typeds |
Enumerations | |
| enum | Compression { GZIP_COMPRESSION, SLIB_COMPRESSION, NO_COMPRESSION } |
| enum | NodeSetType { BOND, CUSTOM_SET } |
| The types of the nodes. | |
| enum | NodeType { REPRESENTATION, GEOMETRY, FEATURE, ALIAS, CUSTOM } |
| The types of the nodes. | |
Functions | |
| RMFEXPORT NodeHandle | add_child_alias (NodeHandle parent, NodeConstHandle alias) |
| RMFEXPORT void | copy_frame (FileConstHandle input, FileHandle output, unsigned int inframe, unsigned int outframe) |
| RMFEXPORT void | copy_structure (FileConstHandle input, FileHandle output) |
| RMFEXPORT HDF5File | create_hdf5_file (std::string name) |
| RMFEXPORT FileHandle | create_rmf_file (std::string path) |
| RMFEXPORT std::string | get_as_node_name (std::string input) |
| RMFEXPORT NodeHandles | get_children_resolving_aliases (NodeHandle nh) |
| RMFEXPORT NodeConstHandles | get_children_resolving_aliases (NodeConstHandle nh) |
| RMFEXPORT bool | get_equal_frame (FileConstHandle input, FileConstHandle out, unsigned int inframe, unsigned int outframe, bool print_diff=false) |
| RMFEXPORT bool | get_equal_structure (FileConstHandle input, FileConstHandle output, bool print_diff=false) |
| RMFEXPORT std::string | get_example_path (std::string file_name) |
| Return the path to installed example data for this module. | |
| RMFEXPORT int | get_number_of_open_hdf5_handles (HDF5ConstFile f=HDF5ConstFile()) |
| RMFEXPORT Strings | get_open_hdf5_handle_names (HDF5ConstFile f=HDF5ConstFile()) |
| RMFEXPORT NodeConstHandles | get_particles_by_resolution (NodeConstHandle h, double resolution, int frame=0) |
| RMFEXPORT std::string | get_set_type_name (NodeSetType t) |
| RMFEXPORT std::string | get_type_name (NodeType t) |
| RMFEXPORT HDF5File | open_hdf5_file (std::string name) |
| RMFEXPORT HDF5ConstFile | open_hdf5_file_read_only (std::string name) |
| RMFEXPORT FileHandle | open_rmf_file (std::string path) |
| RMFEXPORT FileConstHandle | open_rmf_file_read_only (std::string path) |
| void | set_show_hdf5_errors (bool tf) |
| RMFEXPORT void | show_hierarchy (NodeConstHandle root, bool verbose=false, unsigned int frame=0, std::ostream &out=std::cout) |
| RMFEXPORT void | show_hierarchy_with_decorators (NodeConstHandle root, bool verbose=false, unsigned int frame=0, std::ostream &out=std::cout) |
| typedef herr_t(* RMF::HDF5CloseFunction)(hid_t) |
The signature for the HDF5 close functions.
| typedef vector<HDF5File> RMF::HDF5ConstFiles |
| typedef vector<HDF5Group> RMF::HDF5ConstGroups |
| typedef vector<HDF5File> RMF::HDF5Files |
| typedef vector<HDF5Group> RMF::HDF5Groups |
| enum RMF::Compression |
Data sets can be compressed using one of several algorithms.
| enum RMF::NodeSetType |
| enum RMF::NodeType |
The types of the nodes.
| RMFEXPORT NodeHandle RMF::add_child_alias (NodeHandle parent, NodeConstHandle alias) |
Add a child to the node that is an alias to another node.
| RMFEXPORT void RMF::copy_frame (FileConstHandle input, FileHandle output, unsigned int inframe, unsigned int outframe) |
Copy the data of frame inframe in the input rmf file to frame outframe in the output rmf file.
| RMFEXPORT void RMF::copy_structure (FileConstHandle input, FileHandle output) |
Copy the hierarchy structure and set structure from one rmf file to another.
| RMFEXPORT HDF5File RMF::create_hdf5_file | ( | std::string | name | ) |
Create a new hdf5 file, clearing any existing file with the same name if needed. The file cannot already be open.
| RMFEXPORT FileHandle RMF::create_rmf_file | ( | std::string | path | ) |
Create an RMF from a file system path.
| RMFEXPORT std::string RMF::get_as_node_name | ( | std::string | input | ) |
Node names have to obey certain rules, such as no quotes in the name. This returns a string that has been modified to obey the rules.
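As an illustration of this kind of sanitizing, here is a minimal sketch. It assumes the only rule is that single and double quote characters are not allowed; the full rule set used by the library is not given here, and `toy_as_node_name` is a hypothetical name.

```python
FORBIDDEN_CHARACTERS = "'\""  # assumption: only quotes are forbidden


def toy_as_node_name(text):
    """Return text with every forbidden character removed."""
    return "".join(ch for ch in text if ch not in FORBIDDEN_CHARACTERS)


cleaned = toy_as_node_name('chain "A"')
```

Applying the function to an already valid name leaves it unchanged, so it is safe to call on every name before it is stored.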
| String RMF::get_bond_data_data_set_name | ( | ) |
Get the name of the data set for storing bonds.
| String RMF::get_category_name_data_set_name | ( | int | arity | ) |
Get the name of the data set for storing category names.
| RMFEXPORT NodeHandles RMF::get_children_resolving_aliases | ( | NodeHandle | nh | ) |
| RMFEXPORT NodeConstHandles RMF::get_children_resolving_aliases | ( | NodeConstHandle | nh | ) |
Aliases are nodes that refer to other nodes. Resolving them can result in a graph that is no longer a tree or even a DAG.
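The remark about aliases can be seen in a toy tree. The sketch below does not use the RMF API; `AliasNode` and the helper functions are hypothetical, with names chosen to mirror the functions documented here.

```python
class AliasNode:
    """Toy hierarchy node; aliased is the target node if this is an alias."""

    def __init__(self, name, aliased=None):
        self.name = name
        self.children = []
        self.aliased = aliased


def add_child(parent, name):
    child = AliasNode(name)
    parent.children.append(child)
    return child


def add_child_alias(parent, target):
    """Add a child of parent that merely refers to target."""
    alias = AliasNode(target.name + " (alias)", aliased=target)
    parent.children.append(alias)
    return alias


def get_children_resolving_aliases(node):
    """Return the children, replacing each alias by the node it refers to."""
    return [c.aliased if c.aliased is not None else c for c in node.children]


root = AliasNode("root")
a = add_child(root, "a")
b = add_child(a, "b")
add_child_alias(b, root)  # b gets a child that refers back to root

resolved = get_children_resolving_aliases(b)
```

Without resolving, the hierarchy is a plain tree (root, a, b, alias). After resolving, following children from root leads through a and b back to root, so the graph contains a cycle.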
| String RMF::get_data_data_set_name (std::string category_name, int arity, String type_name, bool per_frame) |
Get the name of the data set for storing a particular type of data.
| RMFEXPORT bool RMF::get_equal_frame (FileConstHandle input, FileConstHandle out, unsigned int inframe, unsigned int outframe, bool print_diff = false) |
Return true if the two have the same structure.
| RMFEXPORT bool RMF::get_equal_structure (FileConstHandle input, FileConstHandle output, bool print_diff = false) |
Return true if the two have the same structure.
| RMFEXPORT std::string RMF::get_example_path | ( | std::string | file_name | ) |
Return the path to installed example data for this module.
Return the path to an example file from the name.
Each module has its own example directory, so be sure to use the version of this function in the correct module. For example, to read the file example_protein.pdb located in the examples directory of the IMP::atom module, call IMP::atom::get_example_path("example_protein.pdb").
This will ensure that the code works when IMP is installed or used via the tools/imppy.sh script.
| String RMF::get_key_list_data_set_name (std::string category_name, int Arity, String type_name, bool per_frame) |
Get the name of the attribute which lists all the keys of the category.
| String RMF::get_node_data_data_set_name | ( | ) |
Get the name of the data set storing the data about each node.
| String RMF::get_node_name_data_set_name | ( | ) |
Get the name of the data set storing the name for each node.
| RMFEXPORT int RMF::get_number_of_open_hdf5_handles | ( | HDF5ConstFile | f = HDF5ConstFile() | ) |
For debugging, one can get the number of open hdf5 handles for either one file, or the whole system.
| RMFEXPORT Strings RMF::get_open_hdf5_handle_names | ( | HDF5ConstFile | f = HDF5ConstFile() | ) |
For debugging you can get the names of open handles in either one file or the whole process.
| RMFEXPORT NodeConstHandles RMF::get_particles_by_resolution (NodeConstHandle h, double resolution, int frame = 0) |
Return a list of Particle NodeHandles that forms a slice through the tree and whose radii are as close as possible to the passed resolution.
| String RMF::get_set_data_data_set_name | ( | int | arity | ) |
Get the name of the data set for storing sets of the given arity.
| RMFEXPORT std::string RMF::get_set_type_name | ( | NodeSetType | t | ) |
Return a string version of the type name.
| RMFEXPORT std::string RMF::get_type_name | ( | NodeType | t | ) |
Return a string version of the type name.
| RMFEXPORT HDF5File RMF::open_hdf5_file | ( | std::string | name | ) |
Open an existing hdf5 file. The file cannot already be open.
| RMFEXPORT HDF5ConstFile RMF::open_hdf5_file_read_only | ( | std::string | name | ) |
Open an existing hdf5 file read only. The file cannot already be open.
| RMFEXPORT FileHandle RMF::open_rmf_file | ( | std::string | path | ) |
Open an RMF from a file system path.
| RMFEXPORT FileConstHandle RMF::open_rmf_file_read_only | ( | std::string | path | ) |
Open an RMF from a file system path.
| void RMF::set_show_hdf5_errors | ( | bool | tf | ) |
Turn on and off printing of hdf5 error messages. They can help in diagnostics, but, for the moment, can only be output to standard error and so are off by default.
| RMFEXPORT void RMF::show_hierarchy (NodeConstHandle root, bool verbose = false, unsigned int frame = 0, std::ostream & out = std::cout) |
Print out the hierarchy as an ascii tree.
| RMFEXPORT void RMF::show_hierarchy_with_decorators (NodeConstHandle root, bool verbose = false, unsigned int frame = 0, std::ostream & out = std::cout) |
Print out the hierarchy as an ascii tree marking what decorators apply where.