@@ -3,20 +3,53 @@ Structure Alignment in BioJava
33
44## Data Structures
55
6- ### Legacy AFPChain model
6+ ### AFPChain Data Model (legacy)
77
8- Pairwise structure alignments are currently stored in the ` AFPChain ` class. The
9- class functions as a bean, and contains many variables used internally by
10- various alignment algorithms.
8+ The `AFPChain` data structure was designed to store pairwise structural
9+ alignments. The class functions as a bean and contains many variables
10+ used internally by the alignment algorithms implemented in BioJava.
1111
12- ### Proposed MultipleStructureAlignment data model
12+ The residue equivalencies of the alignment are described in the optimal
13+ alignment variable, a three-dimensional array of integers whose indices stand for:
14+
15+ ```java
16+ int[][][] optAln = new int[block][chain][eqr];
17+ ```
18+
19+ * **block**: the blocks divide the alignment into different parts. The division
20+ can be due to non-topological rearrangements (e.g. circular permutations) or
21+ due to flexible parts (e.g. domain switch). There can be any number of blocks
22+ in a structural alignment, defined by the structure alignment algorithm.
23+
24+ * **chain**: in a pairwise alignment there are only two chains, or structures.
25+
26+ * **eqr**: EQR stands for equivalent residue position, i.e. the alignment position.
27+ There are as many positions (EQRs) in a block as the length of the alignment block,
28+ and their number is the same for both chains in the same block.
29+
30+ Each entry (a combination of the three indices described above) stores an
31+ integer, which corresponds to the residue index in the specified chain, i.e. the
32+ index of the Atom in the chain's atom array. Within a block, the stored integers
33+ (residues) are always in increasing order.
34+
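The indexing scheme described above can be illustrated with a minimal, self-contained sketch. It does not use the BioJava API: the class `OptAlnSketch` and the helpers `alignedPairs` and `isSequentialWithinBlocks` are hypothetical names introduced only for illustration, and the sample array is made-up data shaped like a two-block alignment (for example, a circular permutation).

```java
import java.util.ArrayList;
import java.util.List;

public class OptAlnSketch {

    // Hypothetical helper: lists the aligned pairs {index in chain 0, index in chain 1}.
    static List<int[]> alignedPairs(int[][][] optAln) {
        List<int[]> pairs = new ArrayList<>();
        for (int[][] block : optAln) {
            // Both chains hold the same number of EQRs within a block.
            for (int eqr = 0; eqr < block[0].length; eqr++) {
                pairs.add(new int[] { block[0][eqr], block[1][eqr] });
            }
        }
        return pairs;
    }

    // Hypothetical helper: checks that indices strictly increase inside each block.
    static boolean isSequentialWithinBlocks(int[][][] optAln) {
        for (int[][] block : optAln) {
            for (int[] chain : block) {
                for (int eqr = 1; eqr < chain.length; eqr++) {
                    if (chain[eqr] <= chain[eqr - 1]) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up data: optAln[block][chain][eqr] with two blocks and two chains.
        int[][][] optAln = {
            { { 0, 1, 2 }, { 50, 51, 52 } },
            { { 60, 61 }, { 3, 4 } }
        };
        for (int[] pair : alignedPairs(optAln)) {
            System.out.println(pair[0] + " <-> " + pair[1]);
        }
        System.out.println("sequential within blocks: " + isSequentialWithinBlocks(optAln));
    }
}
```

In this sample, the indices increase inside each block, but the second block maps later residues of the first chain to earlier residues of the second chain, which is why the rearrangement needs its own block.
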
35+ As an overview, the AFPChain data model:
36+
37+ * Only supports **pairwise alignments**, i.e. two chains or structures aligned.
38+ * Can support **flexible alignments** and **non-topological alignments**.
39+ However, their combination (a flexible alignment with topological rearrangements)
40+ cannot be represented, because the blocks mean either one or the other.
41+ * Cannot support **non-sequential alignments**, which would require a new block for
42+ each EQR, because sequentiality of the residues is assumed inside each block.
43+
44+ ### MultipleAlignment Data Model
1345
1446This data structure introduces a more explicit model for storing structure
15- alignments. It is more flexible than the AFPChain model, adding support for
47+ alignments. It is a general model that supports any of the following properties,
48+ or their combination:
1649
17- * Multiple alignments
18- * Non-topological alignments, such as circular permutations
19- * Mutable, while maintaining internal consistency
50+ * Multiple structures: the model is no longer restricted to pairwise alignments.
51+ * Non-topological alignments: such as circular permutations or domain rearrangements.
52+ * Flexible alignments:
2053
2154A ***block*** is a series of aligned residues within a structure. A block must
2255be a sequential alignment; the order of residues within the block should be