Commit e60bfea

committed
Explanation for AFPChain Data Model
Complete the section about the AFPChain data model.
1 parent 509057f commit e60bfea

1 file changed

Lines changed: 42 additions & 9 deletions

structure/alignmentcode.md

@@ -3,20 +3,53 @@ Structure Alignment in BioJava
33

44
## Data Structures
55

6-
### Legacy AFPChain model
6+
### AFPChain Data Model (legacy)
77

8-
Pairwise structure alignments are currently stored in the `AFPChain` class. The
9-
class functions as a bean, and contains many variables used internally by
10-
various alignment algorithms.
8+
The `AFPChain` data structure was designed to store pairwise structural
9+
alignments. The class functions as a bean, and contains many variables
10+
used internally by the alignment algorithms implemented in BioJava.
1111

12-
### Proposed MultipleStructureAlignment data model
12+
The residue equivalencies of the alignment are described in the optimal
13+
alignment variable, a three-dimensional array of integers, where the indices stand for:
14+
15+
```java
16+
int[][][] optAln = new int[block][chain][eqr];
17+
```
18+
19+
* **block**: the blocks divide the alignment into different parts. The division
20+
can be due to non-topological rearrangements (e.g. circular permutations) or
21+
due to flexible parts (e.g. domain switch). There can be any number of blocks
22+
in a structural alignment, defined by the structure alignment algorithm.
23+
24+
* **chain**: in a pairwise alignment there are only two chains, or structures.
25+
26+
* **eqr**: EQR stands for equivalent residue position, i.e. the alignment position.
27+
There are as many positions (EQRs) in a block as the length of the alignment block,
28+
and their number is the same for both chains in a given block.
29+
30+
In each entry (combination of the three indices described above) an integer is
31+
stored, which corresponds to the residue index in the specified chain, i.e. the
32+
Atom index in the chain atom array. Within a block, the stored integers
33+
(residues) are always in increasing order.
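
To make the indexing concrete, here is a minimal sketch of how such an array could be traversed. It works on a plain `int[][][]` with made-up residue indices and does not use any BioJava class; the class name `OptAlnSketch` and both method names are hypothetical. The sequentiality check assumes that "increasing" means strictly increasing.

```java
import java.util.ArrayList;
import java.util.List;

public class OptAlnSketch {

    /** Lists every aligned residue pair as {block, indexInChain0, indexInChain1}. */
    public static List<int[]> alignedPairs(int[][][] optAln) {
        List<int[]> pairs = new ArrayList<>();
        for (int block = 0; block < optAln.length; block++) {
            // Both chains hold the same number of EQRs within a block.
            int eqrCount = optAln[block][0].length;
            for (int eqr = 0; eqr < eqrCount; eqr++) {
                pairs.add(new int[] {block, optAln[block][0][eqr], optAln[block][1][eqr]});
            }
        }
        return pairs;
    }

    /** Returns true if the indices increase strictly inside every block, for each chain. */
    public static boolean isSequentialWithinBlocks(int[][][] optAln) {
        for (int[][] block : optAln) {
            for (int[] chain : block) {
                for (int eqr = 1; eqr < chain.length; eqr++) {
                    if (chain[eqr] <= chain[eqr - 1]) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up example resembling a circular permutation: two blocks, two chains.
        int[][][] optAln = {
            { {0, 1, 2}, {5, 6, 7} },
            { {5, 6},    {0, 1}    }
        };
        for (int[] pair : alignedPairs(optAln)) {
            System.out.println("block " + pair[0] + ": " + pair[1] + " <-> " + pair[2]);
        }
        System.out.println("sequential within blocks: " + isSequentialWithinBlocks(optAln));
    }
}
```

Note that the indices are increasing inside each block but not across blocks: chain 1 goes from 7 in the first block back to 0 in the second, which is exactly what the block division allows.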
34+
35+
As an overview, the AFPChain data model:
36+
37+
* Only supports **pairwise alignments**, i.e. two chains or structures aligned.
38+
* Can support **flexible alignments** and **non-topological alignments**.
39+
However, their combination (a flexible alignment with topological rearrangements)
40+
cannot be represented, because the blocks mean either one or the other.
41+
Cannot support **non-sequential alignments**: they would require a new block for
42+
each EQR, because sequentiality of the residues is assumed inside each block.
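
The last limitation can be illustrated with a small sketch that splits a list of aligned residue pairs into blocks, starting a new block whenever sequentiality breaks. This is an illustration only, not an algorithm taken from BioJava; the class `BlockSplitSketch`, its method, and the input pairs are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockSplitSketch {

    /**
     * Splits aligned pairs {indexInChain0, indexInChain1}, given in alignment order,
     * into blocks. A new block starts whenever either index fails to increase.
     */
    public static List<List<int[]>> splitIntoBlocks(int[][] pairs) {
        List<List<int[]>> blocks = new ArrayList<>();
        List<int[]> current = null;
        int[] previous = null;
        for (int[] pair : pairs) {
            boolean sequential = previous != null
                    && pair[0] > previous[0]
                    && pair[1] > previous[1];
            if (!sequential) {
                // Sequentiality is broken (or this is the first pair): open a new block.
                current = new ArrayList<>();
                blocks.add(current);
            }
            current.add(pair);
            previous = pair;
        }
        return blocks;
    }

    public static void main(String[] args) {
        // A circular permutation needs only one extra block.
        int[][] permuted = { {0, 5}, {1, 6}, {2, 7}, {5, 0}, {6, 1} };
        // A fully reversed (non-sequential) alignment breaks at every position.
        int[][] reversed = { {0, 4}, {1, 3}, {2, 2}, {3, 1}, {4, 0} };
        System.out.println("permuted blocks: " + splitIntoBlocks(permuted).size());
        System.out.println("reversed blocks: " + splitIntoBlocks(reversed).size());
    }
}
```

With these inputs the permuted alignment fits in two blocks, while the reversed one ends up with one block per EQR, which is the degenerate case described in the list above.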
43+
44+
### MultipleAlignment Data Model
1345

1446
This data structure introduces a more explicit model for storing structure
15-
alignments. It is more flexible than the AFPChain model, adding support for
47+
alignments. It is a general model that supports any of the following properties,
48+
or their combination:
1649

17-
* Multiple alignments
18-
* Non-topological alignments, such as circular permutations
19-
* Mutable, while maintaining internal consistency
50+
* Multiple structures: the model is no longer restricted to pairwise alignments.
51+
* Non-topological alignments: such as circular permutations or domain rearrangements.
52+
* Flexible alignments:
2053

2154
A ***block*** is a series of aligned residues within a structure. A block must
2255
be a sequential alignment; the order of residues within the block should be
