diff --git a/structure/contact-map.md b/structure/contact-map.md
new file mode 100644
index 0000000..f4127ac
--- /dev/null
+++ b/structure/contact-map.md
@@ -0,0 +1,63 @@
+# Finding contacts between atoms in a protein: contact maps
+
+Contacts are a useful tool to analyse protein structures. They simplify the three-dimensional view of a structure into a two-dimensional set of contacts between its atoms or its residues. The representation of the contacts in a matrix is known as the contact map. Many protein structure analysis and prediction methods rely on contacts. For instance, they are useful for:
+
++ development of structural alignment algorithms [Holm 1993][] [Caprara 2004][]
++ automatic domain identification [Alexandrov 2003][] [Emmert-Streib 2007][]
++ structural modelling by extraction of contact-based empirical potentials [Benkert 2008][]
++ structure prediction via contact prediction from sequence information [Jones 2012][]
+
+## Getting the contact map of a protein chain
+
+This code snippet will produce the set of contacts between all C alpha atoms for chain A of PDB entry [1SMT](http://www.rcsb.org/pdb/explore.do?structureId=1SMT):
+
+```java
+    AtomCache cache = new AtomCache();
+    StructureIO.setAtomCache(cache);
+
+    Structure structure = StructureIO.getStructure("1SMT");
+
+    Chain chain = structure.getChainByPDB("A");
+
+    // we want contacts between C alpha atoms only
+    String[] atoms = {" CA "};
+    // the distance cutoff we use is 8 Å
+    AtomContactSet contacts = StructureTools.getAtomsInContact(chain, atoms, 8.0);
+
+    System.out.println("Total number of CA-CA contacts: "+contacts.size());
+
+
+```
+
+The algorithm finds the contacts using geometric hashing, without calculating a full distance matrix, so it scales well to large structures.
+
+## Getting the contacts between two protein chains
+
+One can also find the contacting atoms between two protein chains.
For instance, the following code finds the contacts between the first two chains of PDB entry [1SMT](http://www.rcsb.org/pdb/explore.do?structureId=1SMT):
+
+```java
+    AtomCache cache = new AtomCache();
+    StructureIO.setAtomCache(cache);
+
+    Structure structure = StructureIO.getStructure("1SMT");
+
+    AtomContactSet contacts =
+            StructureTools.getAtomsInContact(structure.getChain(0), structure.getChain(1), 5, false);
+
+    System.out.println("Total number of atom contacts: "+contacts.size());
+
+    // the list of atom contacts can be reduced to a list of contacts between groups:
+    GroupContactSet groupContacts = new GroupContactSet(contacts);
+```
+
+
+See [DemoContacts](https://github.com/biojava/biojava/blob/master/biojava3-structure/src/main/java/demo/DemoContacts.java) for a fully working demo of the examples above.
+
+
+
+[Holm 1993]: http://www.biomedcentral.com/pubmed/8377180
+[Caprara 2004]: http://www.biomedcentral.com/pubmed/15072687
+[Alexandrov 2003]: http://www.biomedcentral.com/pubmed/12584135
+[Emmert-Streib 2007]: http://www.biomedcentral.com/pubmed/17608939
+[Benkert 2008]: http://www.biomedcentral.com/pubmed/17932912
+[Jones 2012]: http://www.ncbi.nlm.nih.gov/pubmed/22101153
diff --git a/structure/crystal-contacts.md b/structure/crystal-contacts.md
new file mode 100644
index 0000000..97ba9bb
--- /dev/null
+++ b/structure/crystal-contacts.md
@@ -0,0 +1,47 @@
+# How to find all crystal contacts in a PDB structure
+
+## Why crystal contacts?
+
+A protein structure is determined by X-ray diffraction from a protein crystal, i.e. an infinite lattice of molecules. Thus, the end result of the diffraction experiment is a crystal lattice and not just a single molecule. However, the PDB file only contains the coordinates of the Asymmetric Unit (AU), defined as the minimum unit needed to reconstruct the full crystal using symmetry operators.
+
+Looking at the AU alone is not enough to understand the crystal structure.
For instance, the biologically relevant assembly (known as the Biological Unit) may be formed through a symmetry operator that can be found by looking at the crystal contacts. See, for example, [1M4N](http://www.rcsb.org/pdb/explore.do?structureId=1M4N): its biological unit is a dimer formed through a 2-fold operator, and the dimer interface is the largest one found in the crystal.
+
+Looking at crystal contacts can also be important for assessing the quality and reliability of the deposited PDB model: an AU can look perfectly fine, but upon reconstruction of the lattice the molecules may clash, which indicates that something is wrong in the model.
+
+
+## Getting the set of unique contacts in the crystal lattice
+
+This code snippet will produce a list of all non-redundant interfaces present in the crystal lattice of PDB entry [1SMT](http://www.rcsb.org/pdb/explore.do?structureId=1SMT):
+
+```java
+    AtomCache cache = new AtomCache();
+
+    StructureIO.setAtomCache(cache);
+
+    Structure structure = StructureIO.getStructure("1SMT");
+
+    CrystalBuilder cb = new CrystalBuilder(structure);
+
+    // 6 Å is the distance cutoff to consider 2 atoms in contact
+    StructureInterfaceList interfaces = cb.getUniqueInterfaces(6);
+
+    System.out.println("The crystal contains "+interfaces.size()+" unique interfaces");
+
+    // this calculates the buried surface areas of all interfaces and sorts them by area
+    interfaces.calcAsas(3000, 1, -1);
+
+    // we can get the largest interface in the crystal (interface ids start at 1) and look at its area
+    interfaces.get(1).getTotalArea();
+
+```
+
+An interface is defined here as any 2 chains with at least one pair of atoms within the given distance cutoff (6 Å in the example above).
+
+The algorithm to find all unique interfaces in the crystal works roughly like this:
++ Reconstructs the full unit cell by applying the matrix operators of the corresponding space group to the Asymmetric Unit.
++ Searches all cells around the original one by applying crystal translations; if any 2 chains in that search are found to be in contact, the new contact is added to the final list.
++ The search is performed without repeating redundant symmetry operators, making sure that if a contact is found then it is a unique contact.
+
+See [DemoCrystalInterfaces](https://github.com/biojava/biojava/blob/master/biojava3-structure/src/main/java/demo/DemoCrystalInterfaces.java) for a fully working demo of the example above.
+
+
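The lattice search outlined in the list in `crystal-contacts.md` can be illustrated with a small, self-contained toy. This is only a sketch under stated assumptions and not BioJava's `CrystalBuilder` implementation: the class and method names (`LatticeContactSketch`, `findContacts`, `inContact`) are hypothetical, the unit cell is assumed to be orthogonal and already fully reconstructed, only the 26 neighbouring cells are searched, and only translational redundancy is removed (a contact between chain A and chain B shifted by `t` is the same as one between chain B and chain A shifted by `-t`, so only half of the translations are visited).

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: NOT BioJava's CrystalBuilder. All names are hypothetical.
class LatticeContactSketch {

    // True if any atom of chain a lies within cutoff of any atom of chain b moved by shift.
    static boolean inContact(double[][] a, double[][] b, double[] shift, double cutoff) {
        double cutoffSq = cutoff * cutoff;
        for (double[] p : a) {
            for (double[] q : b) {
                double dx = p[0] - (q[0] + shift[0]);
                double dy = p[1] - (q[1] + shift[1]);
                double dz = p[2] - (q[2] + shift[2]);
                if (dx * dx + dy * dy + dz * dz <= cutoffSq) return true;
            }
        }
        return false;
    }

    // chains[c][atom] = {x, y, z}; cell = edge lengths of an orthogonal unit cell (assumption).
    static List<String> findContacts(String[] names, double[][][] chains, double[] cell, double cutoff) {
        List<String> contacts = new ArrayList<>();
        for (int i = -1; i <= 1; i++) {
            for (int j = -1; j <= 1; j++) {
                for (int k = -1; k <= 1; k++) {
                    boolean identity = (i == 0 && j == 0 && k == 0);
                    // visit only one translation of each (t, -t) pair
                    boolean positive = i > 0 || (i == 0 && (j > 0 || (j == 0 && k > 0)));
                    if (!identity && !positive) continue;
                    double[] shift = {i * cell[0], j * cell[1], k * cell[2]};
                    for (int p = 0; p < chains.length; p++) {
                        // inside the original cell each unordered pair is checked once;
                        // in a translated cell a chain may also contact its own copy
                        int start = identity ? p + 1 : 0;
                        for (int q = start; q < chains.length; q++) {
                            if (inContact(chains[p], chains[q], shift, cutoff)) {
                                contacts.add(names[p] + "-" + names[q] + "(" + i + "," + j + "," + k + ")");
                            }
                        }
                    }
                }
            }
        }
        return contacts;
    }

    public static void main(String[] args) {
        // two single-atom "chains" at opposite faces of a 10 x 10 x 10 cell
        String[] names = {"A", "B"};
        double[][][] chains = {{{1, 5, 5}}, {{9, 5, 5}}};
        System.out.println(findContacts(names, chains, new double[] {10, 10, 10}, 3.0));
    }
}
```

In this toy input the two atoms are 8 Å apart inside the cell, so they only come into contact across the cell boundary: chain B touches the copy of chain A translated by one cell along x. A real implementation additionally has to handle non-orthogonal cells and symmetry-equivalent interfaces, which this sketch deliberately leaves out.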