Commit c78ff25

Merge pull request biojava#1 from josemduarte/master
New tutorials on contacts and crystal interfaces
2 parents e42737d + 4f3b37b commit c78ff25

2 files changed

+110
-0
lines changed

structure/contact-map.md

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
# Finding contacts between atoms in a protein: contact maps

Contacts are a useful tool for analysing protein structures. They reduce the three-dimensional view of a structure to a two-dimensional set of contacts between its atoms or residues. The matrix representation of these contacts is known as the contact map. Many protein structure analysis and prediction methods rely on contacts. For instance, they are useful for:

+ development of structural alignment algorithms [Holm 1993][] [Caprara 2004][]
+ automatic domain identification [Alexandrov 2003][] [Emmert-Streib 2007][]
+ structural modelling by extraction of contact-based empirical potentials [Benkert 2008][]
+ structure prediction via contact prediction from sequence information [Jones 2012][]

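To make the idea of a contact map concrete, here is a minimal plain-Java sketch that does not use BioJava at all. It builds a symmetric boolean matrix from a set of coordinates and a distance cutoff. The class name `ContactMapSketch`, its methods and the sample coordinates are hypothetical and only serve as an illustration.

```java
/** Plain-Java illustration of a contact map; hypothetical names, not BioJava code. */
class ContactMapSketch {

    /** Entry [i][j] is true if points i and j (i != j) lie within the cutoff of each other. */
    static boolean[][] contactMap(double[][] coords, double cutoff) {
        int n = coords.length;
        boolean[][] map = new boolean[n][n];
        double cutoffSq = cutoff * cutoff;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                double dx = coords[i][0] - coords[j][0];
                double dy = coords[i][1] - coords[j][1];
                double dz = coords[i][2] - coords[j][2];
                boolean inContact = dx * dx + dy * dy + dz * dz <= cutoffSq;
                // the map is symmetric, so both halves are filled
                map[i][j] = inContact;
                map[j][i] = inContact;
            }
        }
        return map;
    }

    /** Counts each contacting pair once by scanning the upper triangle. */
    static int countContacts(boolean[][] map) {
        int count = 0;
        for (int i = 0; i < map.length; i++) {
            for (int j = i + 1; j < map.length; j++) {
                if (map[i][j]) count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // made-up C-alpha positions in Angstroms, 3.8 apart along a line
        double[][] ca = { {0, 0, 0}, {3.8, 0, 0}, {7.6, 0, 0}, {11.4, 0, 0} };
        boolean[][] map = contactMap(ca, 8.0);
        System.out.println("Contacts: " + countContacts(map));
    }
}
```

This brute-force version compares every pair of points, which is fine for illustration but grows quadratically with the number of atoms.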
## Getting the contact map of a protein chain

The following snippet computes the set of contacts between all C-alpha atoms of chain A of PDB entry [1SMT](http://www.rcsb.org/pdb/explore.do?structureId=1SMT):

```java
AtomCache cache = new AtomCache();
StructureIO.setAtomCache(cache);

Structure structure = StructureIO.getStructure("1SMT");

Chain chain = structure.getChainByPDB("A");

// we want contacts between C-alpha atoms only
String[] atoms = {" CA "};
// the distance cutoff we use is 8 Å
AtomContactSet contacts = StructureTools.getAtomsInContact(chain, atoms, 8.0);

System.out.println("Total number of CA-CA contacts: " + contacts.size());
```

The contact search uses geometric hashing, so it never needs to compute the full distance matrix and scales well with the number of atoms.
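
The idea of avoiding the full distance matrix can be sketched with a simple grid: points are hashed into cubic cells whose side equals the cutoff, so each point only needs to be compared with points in its own cell and the 26 surrounding ones. This is a plain-Java illustration under that assumption, with hypothetical names; BioJava's actual implementation may differ in its details.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of a grid-based contact search; not BioJava's implementation. */
class GridContactSketch {

    /** Index of the grid cell containing coordinate v, for cells of the given side length. */
    static int cellIndex(double v, double side) {
        return (int) Math.floor(v / side);
    }

    /** Returns index pairs {i, j} with i < j for all points within the cutoff of each other. */
    static List<int[]> findContacts(double[][] coords, double cutoff) {
        // 1. hash every point into a cubic cell whose side equals the cutoff
        Map<String, List<Integer>> grid = new HashMap<>();
        int[][] cellOf = new int[coords.length][3];
        for (int i = 0; i < coords.length; i++) {
            for (int d = 0; d < 3; d++) {
                cellOf[i][d] = cellIndex(coords[i][d], cutoff);
            }
            String key = cellOf[i][0] + "," + cellOf[i][1] + "," + cellOf[i][2];
            grid.computeIfAbsent(key, k -> new ArrayList<>()).add(i);
        }

        // 2. compare each point only with points in its own and the 26 neighbouring cells
        List<int[]> contacts = new ArrayList<>();
        double cutoffSq = cutoff * cutoff;
        for (int i = 0; i < coords.length; i++) {
            for (int dx = -1; dx <= 1; dx++) {
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dz = -1; dz <= 1; dz++) {
                        String key = (cellOf[i][0] + dx) + "," + (cellOf[i][1] + dy)
                                + "," + (cellOf[i][2] + dz);
                        List<Integer> candidates = grid.get(key);
                        if (candidates == null) continue;
                        for (int j : candidates) {
                            if (j <= i) continue; // report each pair only once
                            double ex = coords[i][0] - coords[j][0];
                            double ey = coords[i][1] - coords[j][1];
                            double ez = coords[i][2] - coords[j][2];
                            if (ex * ex + ey * ey + ez * ez <= cutoffSq) {
                                contacts.add(new int[] {i, j});
                            }
                        }
                    }
                }
            }
        }
        return contacts;
    }

    public static void main(String[] args) {
        // made-up coordinates in Angstroms; the last point is far away from the rest
        double[][] pts = { {0, 0, 0}, {3.8, 0, 0}, {7.6, 0, 0}, {11.4, 0, 0}, {-50, -50, -50} };
        System.out.println("Contacts found: " + findContacts(pts, 8.0).size());
    }
}
```

Because the cell side equals the cutoff, two points that are in contact can never be more than one cell apart, which is why checking the neighbouring cells is sufficient.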

## Getting the contacts between two protein chains

One can also find the contacting atoms between two protein chains. For instance, the following code finds the contacts between the first two chains of PDB entry [1SMT](http://www.rcsb.org/pdb/explore.do?structureId=1SMT):

```java
AtomCache cache = new AtomCache();
StructureIO.setAtomCache(cache);

Structure structure = StructureIO.getStructure("1SMT");

// 5 is the distance cutoff; the last argument (false) leaves hetero atoms out of the search
AtomContactSet contacts =
        StructureTools.getAtomsInContact(structure.getChain(0), structure.getChain(1), 5, false);

System.out.println("Total number of atom contacts: " + contacts.size());

// the list of atom contacts can be reduced to a list of contacts between groups:
GroupContactSet groupContacts = new GroupContactSet(contacts);
```
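
The reduction from atom contacts to group contacts can be pictured as collapsing all atom pairs that belong to the same pair of groups (residues) into one entry. The following plain-Java sketch shows that idea with hypothetical names and made-up data; it is not how `GroupContactSet` is implemented, only an illustration of the concept.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java sketch of collapsing atom contacts into group contacts; hypothetical names. */
class GroupContactSketch {

    /**
     * Each atom contact is given as {groupA, atomA, groupB, atomB}.
     * Returns a map from "groupA|groupB" to the number of atom contacts between the two groups.
     */
    static Map<String, Integer> toGroupContacts(String[][] atomContacts) {
        Map<String, Integer> groupContacts = new LinkedHashMap<>();
        for (String[] contact : atomContacts) {
            String key = contact[0] + "|" + contact[2];
            // the first contact for a group pair creates the entry, later ones increment it
            groupContacts.merge(key, 1, Integer::sum);
        }
        return groupContacts;
    }

    public static void main(String[] args) {
        // made-up atom contacts between residues of chains A and B
        String[][] atomContacts = {
            {"A:10", "CA", "B:55", "CB"},
            {"A:10", "N",  "B:55", "O"},
            {"A:12", "CG", "B:60", "CD"}
        };
        toGroupContacts(atomContacts)
                .forEach((pair, n) -> System.out.println(pair + " -> " + n + " atom contact(s)"));
    }
}
```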

See [DemoContacts](https://github.com/biojava/biojava/blob/master/biojava3-structure/src/main/java/demo/DemoContacts.java) for a fully working demo of the examples above.

[Holm 1993]: http://www.biomedcentral.com/pubmed/8377180
[Caprara 2004]: http://www.biomedcentral.com/pubmed/15072687
[Alexandrov 2003]: http://www.biomedcentral.com/pubmed/12584135
[Emmert-Streib 2007]: http://www.biomedcentral.com/pubmed/17608939
[Benkert 2008]: http://www.biomedcentral.com/pubmed/17932912
[Jones 2012]: http://www.ncbi.nlm.nih.gov/pubmed/22101153

structure/crystal-contacts.md

Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,47 @@
# How to find all crystal contacts in a PDB structure

## Why crystal contacts?

When a protein structure is determined by X-ray diffraction, the sample is a protein crystal, i.e. an infinite lattice of molecules. The end result of the diffraction experiment is therefore a crystal lattice and not just a single molecule. However, the PDB file only contains the coordinates of the Asymmetric Unit (AU), defined as the smallest unit needed to reconstruct the full crystal using symmetry operators.

Looking at the AU alone is not enough to understand the crystal structure. For instance, the biologically relevant assembly (known as the Biological Unit) can be formed through a symmetry operator, which can be found by looking at the crystal contacts. See for instance [1M4N](http://www.rcsb.org/pdb/explore.do?structureId=1M4N): its biological unit is a dimer formed through a 2-fold operator, and it is the largest interface found in the crystal.

Looking at crystal contacts can also be important for assessing the quality and reliability of the deposited PDB model: an AU can look perfectly fine, but upon reconstruction of the lattice the molecules may clash, which indicates that something is wrong in the model.

## Getting the set of unique contacts in the crystal lattice

The following snippet produces a list of all non-redundant interfaces present in the crystal lattice of PDB entry [1SMT](http://www.rcsb.org/pdb/explore.do?structureId=1SMT):

```java
AtomCache cache = new AtomCache();

StructureIO.setAtomCache(cache);

Structure structure = StructureIO.getStructure("1SMT");

CrystalBuilder cb = new CrystalBuilder(structure);

// 6 is the distance cutoff to consider 2 atoms in contact
StructureInterfaceList interfaces = cb.getUniqueInterfaces(6);

System.out.println("The crystal contains " + interfaces.size() + " unique interfaces");

// this calculates the buried surface areas of all interfaces and sorts them by area
interfaces.calcAsas(3000, 1, -1);

// we can get the largest interface in the crystal and look at its area
// (after sorting, the interface with id 1 is the largest one)
System.out.println("Largest interface area: " + interfaces.get(1).getTotalArea());
```

An interface is defined here as any two chains with at least one pair of atoms within the given distance cutoff (6 Å in the example above).
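
This definition can be expressed as a simple test that stops at the first atom pair found within the cutoff. The sketch below is plain Java with hypothetical names and made-up coordinates; it only illustrates the definition and is not the check BioJava performs internally.

```java
/** Plain-Java sketch of the interface definition; hypothetical names, not BioJava code. */
class InterfaceCheckSketch {

    /** True if at least one atom of chain A lies within the cutoff of an atom of chain B. */
    static boolean formInterface(double[][] chainA, double[][] chainB, double cutoff) {
        double cutoffSq = cutoff * cutoff;
        for (double[] a : chainA) {
            for (double[] b : chainB) {
                double dx = a[0] - b[0];
                double dy = a[1] - b[1];
                double dz = a[2] - b[2];
                if (dx * dx + dy * dy + dz * dz <= cutoffSq) {
                    return true; // one pair is enough, no need to look further
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // made-up atom coordinates in Angstroms
        double[][] chainA = { {0, 0, 0}, {1.5, 0, 0} };
        double[][] chainB = { {20, 0, 0}, {6.5, 0, 0} };
        System.out.println("Interface at 6 A cutoff: " + formInterface(chainA, chainB, 6.0));
    }
}
```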

The algorithm to find all unique interfaces in the crystal works roughly like this:
+ It reconstructs the full unit cell by applying the matrix operators of the corresponding space group to the Asymmetric Unit.
+ It searches all cells around the original one by applying crystal translations; if any two chains in that search are found to be in contact, the new contact is added to the final list.
+ The search is performed without repeating redundant symmetry operators, ensuring that every contact found is unique.
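
The first two steps above can be sketched in plain Java under strong simplifying assumptions: the unit cell is orthogonal, the AU is treated as a single body rather than chain by chain, operator 0 is the identity, and no attempt is made to eliminate redundant operators (the third step). All names and numbers are hypothetical, and this is not the `CrystalBuilder` implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java sketch of a lattice contact search; hypothetical names, not BioJava code. */
class LatticeSearchSketch {

    /** Applies x' = R x + t to every point, in fractional coordinates. */
    static double[][] transform(double[][] rot, double[] trans, double[][] frac) {
        double[][] out = new double[frac.length][3];
        for (int p = 0; p < frac.length; p++) {
            for (int r = 0; r < 3; r++) {
                out[p][r] = trans[r];
                for (int c = 0; c < 3; c++) {
                    out[p][r] += rot[r][c] * frac[p][c];
                }
            }
        }
        return out;
    }

    /** True if any pair of points is within the cutoff; cell holds the edge lengths a, b, c. */
    static boolean inContact(double[][] fracA, double[][] fracB, double[] cell, double cutoff) {
        for (double[] a : fracA) {
            for (double[] b : fracB) {
                double distSq = 0;
                for (int d = 0; d < 3; d++) {
                    double delta = (a[d] - b[d]) * cell[d]; // fractional to Angstroms (orthogonal cell)
                    distSq += delta * delta;
                }
                if (distSq <= cutoff * cutoff) return true;
            }
        }
        return false;
    }

    /** Returns {operator index, i, j, k} for every symmetry copy that touches the original AU. */
    static List<int[]> findLatticeContacts(double[][] auFrac, double[][][] rots, double[][] transs,
                                           double[] cell, double cutoff) {
        double[][] identity = { {1, 0, 0}, {0, 1, 0}, {0, 0, 1} };
        List<int[]> found = new ArrayList<>();
        for (int op = 0; op < rots.length; op++) {
            // step 1: one copy of the AU per space-group operator fills the unit cell
            double[][] copy = transform(rots[op], transs[op], auFrac);
            // step 2: place that copy in the original cell and in its 26 neighbours
            for (int i = -1; i <= 1; i++) {
                for (int j = -1; j <= 1; j++) {
                    for (int k = -1; k <= 1; k++) {
                        if (op == 0 && i == 0 && j == 0 && k == 0) continue; // the AU itself
                        double[][] shifted = transform(identity, new double[] {i, j, k}, copy);
                        if (inContact(auFrac, shifted, cell, cutoff)) {
                            found.add(new int[] {op, i, j, k});
                        }
                    }
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        double[] cell = {10, 10, 10};                 // made-up cell edges in Angstroms
        double[][] au = { {0.1, 0.1, 0.1} };          // one atom, fractional coordinates
        double[][][] rots = {
            { {1, 0, 0}, {0, 1, 0}, {0, 0, 1} },      // identity
            { {-1, 0, 0}, {0, -1, 0}, {0, 0, 1} }     // 2-fold axis along z
        };
        double[][] transs = { {0, 0, 0}, {0, 0, 0} };
        for (int[] c : findLatticeContacts(au, rots, transs, cell, 6.0)) {
            System.out.println("operator " + c[0] + ", translation (" + c[1] + "," + c[2] + "," + c[3] + ")");
        }
    }
}
```

A real implementation also has to handle non-orthogonal cells, work per pair of chains, and recognise when two operator and translation combinations describe the same interface.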

See [DemoCrystalInterfaces](https://github.com/biojava/biojava/blob/master/biojava3-structure/src/main/java/demo/DemoCrystalInterfaces.java) for a fully working demo of the example above.
