@@ -56,23 +56,32 @@ Currently there exist two alternatives to parse the secondary structure in **Bio
5656files of deposited structures (author assignment) or from the output file of a DSSP prediction. Both file types
5757can be obtained from the PDB servers, if available, so they can be automatically fetched by BioJava.
5858
59+ As an example, here are the links for the structure **5PTI**: its
60+ [PDB file](http://www.rcsb.org/pdb/files/5PTI.pdb) (search for the HELIX and SHEET lines) and its
61+ [DSSP file](http://www.rcsb.org/pdb/files/5PTI.dssp).
62+
5963Note that the DSSP prediction output is more detailed and complete than the authors' assignment.
6064The choice of one or the other will depend on the use case.
6165
6266Below you can find some examples of how to parse and assign the SS of a `Structure`:
6367
6468``` java
69+ String pdbID = "5pti";
6570 FileParsingParameters params = new FileParsingParameters();
66- params.setParseSecStruc(true);
71+ // Only change needed to the normal Structure loading
72+ params.setParseSecStruc(true); // this is false by default
6773
6874 AtomCache cache = new AtomCache();
6975 cache.setFileParsingParams(params);
70- cache.setUseMmCif(false);
7176
72- Structure s = cache.getStructure("5pti");
77+ // The loaded Structure contains the SS assigned
78+ Structure s = cache.getStructure(pdbID);
79+
80+ // If the more detailed DSSP prediction is required call this afterwards
81+ DSSPParser.fetch(pdbID, s, true); // Third parameter (true) overrides the previous SS
7382```
7483
75- For more examples search in the ** demo** package for ` DemoLoadSecStruc ` and ` DemoParseSecStruc ` .
84+ For more examples search in the **demo** package for `DemoLoadSecStruc`.
7685
7786## Prediction of Secondary Structure in BioJava
7887
@@ -81,16 +90,113 @@ For more examples search in the **demo** package for `DemoLoadSecStruc` and `Dem
8190The algorithm implemented in BioJava for the prediction of SS is `DSSP`. It is described in the paper by
8291[Kabsch W. & Sander C. in 1983](http://onlinelibrary.wiley.com/doi/10.1002/bip.360221211/abstract)
8392[![pubmed](http://img.shields.io/badge/in-pubmed-blue.svg?style=flat)](http://www.ncbi.nlm.nih.gov/pubmed/6667333).
93+ A brief explanation of the algorithm and the output format can be found
94+ [here](http://swift.cmbi.ru.nl/gv/dssp/DSSP_3.html).
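
To make the core criterion of the algorithm concrete, here is a minimal sketch of the electrostatic hydrogen-bond energy defined in the Kabsch & Sander paper cited above: E = q1·q2·f·(1/r(ON) + 1/r(CH) − 1/r(OH) − 1/r(CN)), with q1 = 0.42e, q2 = 0.20e, f = 332 and a bond assigned when E < −0.5 kcal/mol. The class and method names are hypothetical and are not part of the BioJava API, and the distances in `main` are illustrative values, not measurements from a real structure.

```java
class HBondEnergySketch {

    // q1 * q2 * f from Kabsch & Sander (1983): 0.42e * 0.20e * 332
    static final double Q = 0.42 * 0.20 * 332.0;

    // Energies below this cutoff (kcal/mol) are counted as hydrogen bonds
    static final double CUTOFF = -0.5;

    /** Electrostatic energy in kcal/mol; all distances in Angstrom. */
    static double energy(double rON, double rCH, double rOH, double rCN) {
        return Q * (1.0 / rON + 1.0 / rCH - 1.0 / rOH - 1.0 / rCN);
    }

    /** True when the C=O and N-H groups are considered hydrogen bonded. */
    static boolean isHBond(double rON, double rCH, double rOH, double rCN) {
        return energy(rON, rCH, rOH, rCN) < CUTOFF;
    }

    public static void main(String[] args) {
        // Illustrative distances only (assumption of this sketch)
        double e = energy(2.9, 3.1, 1.9, 4.1);
        System.out.println("E = " + e + " kcal/mol, H-bond: " + (e < CUTOFF));
    }
}
```

The secondary structure types (turns, helices, bridges, strands) are then derived from patterns of such bonds along the chain, as described in the linked explanation.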
95+
96+ The interface is simple: a single method, named *predict()*, calculates the SS and can assign it to the
97+ input Structure, overriding any previous annotation, as with the DSSPParser. An example can be found below:
8498
8599``` java
86- GuiWrapper . display(afpChain, ca1, ca2);
87- // Or using the biojava-structure-gui module
88- StructureAlignmentDisplay . display(afpChain, ca1, ca2);
100+ String pdbID = "5pti";
101+ AtomCache cache = new AtomCache();
102+
103+ // Load structure without any SS assignment
104+ Structure s = cache.getStructure(pdbID);
105+
106+ // Predict and assign the SS of the Structure
107+ SecStrucPred ssp = new SecStrucPred(); // Instantiation needed
108+ ssp.predict(s, true); // true assigns the SS to the Structure
89109```
90110
91- ### Data Structures
111+ BioJava Class: [org.biojava.nbio.structure.secstruc.SecStrucPred](http://www.biojava.org/docs/api/org/biojava/nbio/structure/secstruc/SecStrucPred.html)
113+
114+ ### Storage and Data Structures
115+
116+ Because there are different sources of SS annotation, the data structure in **BioJava** that stores SS assignments
117+ has two levels. The top level `SecStrucInfo` is very general and only contains two properties: **assignment**
118+ (a String describing the source of information) and **type** (the SS type).
92119
120+ However, there is an extended container `SecStrucState`, which is a subclass of `SecStrucInfo`, that stores
121+ all the information of the hydrogen bonding, turns, bends, etc. used for the SS prediction and present in the
122+ DSSP output file format. This information is only used in certain applications, and that is the reason for the
123+ more general `SecStrucInfo` class being used by default.
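
The two-level design can be illustrated with a simplified stand-in. The classes below are hypothetical and only mirror the idea described above (a general container plus a subclass with DSSP-level detail). They are not the actual BioJava classes, and the extra fields are an illustrative subset chosen for this sketch.

```java
class SecStrucLevelsSketch {

    /** General level: where the annotation came from and the SS type. */
    public static class InfoSketch {
        public final String assignment; // source of the annotation
        public final char type;         // SS type, e.g. 'H', 'E', 'G'

        public InfoSketch(String assignment, char type) {
            this.assignment = assignment;
            this.type = type;
        }

        @Override
        public String toString() {
            return assignment + ": " + type;
        }
    }

    /** Extended level: adds per-residue detail found in DSSP output. */
    public static class StateSketch extends InfoSketch {
        public final double phi;   // backbone torsion angle, degrees
        public final double psi;   // backbone torsion angle, degrees
        public final boolean bend; // bend flag

        public StateSketch(String assignment, char type,
                           double phi, double psi, boolean bend) {
            super(assignment, type);
            this.phi = phi;
            this.psi = psi;
            this.bend = bend;
        }
    }

    public static void main(String[] args) {
        // Code that only needs the SS type can work with the general level
        InfoSketch info = new StateSketch("DSSP", 'G', -62.8, -28.5, true);
        System.out.println(info);

        // Code that needs the details checks for the extended level
        if (info instanceof StateSketch) {
            StateSketch state = (StateSketch) info;
            System.out.println("phi=" + state.phi + " psi=" + state.psi);
        }
    }
}
```

Callers that only need the type depend on the general level, while applications that need the hydrogen-bonding detail can check for and cast to the extended one.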
124+
125+ To access the SS information of a `Structure`, the `SecStrucInfo` object needs to be obtained from the
126+ `Group` properties. Below is an example of how to access and print, residue by residue, the SS information of
127+ a `Structure`:
128+
129+ ``` java
130+ // This structure should have SS assigned (by any of the methods described)
131+ Structure s = cache.getStructure(pdbID); // e.g. loaded as in the first example
132+
133+ for (Chain c : s.getChains()) {
134+     for (Group g : c.getAtomGroups()) {
135+         if (g.hasAminoAtoms()) { // Only AA store SS
136+             // Obtain the object that stores the SS
137+             SecStrucInfo ss = (SecStrucInfo) g.getProperty(Group.SEC_STRUC);
138+             // Print information: chain+resn+name+SS
139+             System.out.println(c.getChainID() + " " +
140+                 g.getResidueNumber() + " " +
141+                 g.getPDBName() + " -> " + ss);
142+         }
143+     }
144+ }
145+ ```
93146
147+ ### Output Formats
148+
149+ Once the SS has been assigned (either loaded or predicted), **BioJava** offers several formats to visualize it:
150+
151+ - **DSSP format**: the SS can be printed in the DSSP output file format, following the standard so that it can be
152+ parsed again. It is the safest way to serialize an SS annotation and recover it later, but it is probably the most
153+ complicated to visualize.
154+
155+ <pre >
156+ # RESIDUE AA STRUCTURE BP1 BP2 ACC N-H-->O O-->H-N N-H-->O O-->H-N TCO KAPPA ALPHA PHI PSI X-CA Y-CA Z-CA
157+ 1 1 A R 0 0 168 0, 0.0 54,-0.1 0, 0.0 5,-0.1 0.000 360.0 360.0 360.0 139.2 32.2 14.7 -11.8
158+ 2 2 A P > - 0 0 45 0, 0.0 3,-1.8 0, 0.0 4,-0.3 -0.194 360.0-122.0 -61.4 144.9 34.9 13.6 -9.4
159+ 3 3 A D G > S+ 0 0 122 1,-0.3 3,-1.6 2,-0.2 4,-0.2 0.790 108.3 71.4 -62.8 -28.5 35.8 10.0 -9.5
160+ 4 4 A F G > S+ 0 0 26 1,-0.3 3,-1.7 2,-0.2 -1,-0.3 0.725 83.7 70.4 -64.1 -23.3 35.0 9.7 -5.9
161+ </pre >
162+
163+ - **FASTA format**: a simple format that prints the SS type of each residue sequentially, in the order of the amino acids.
164+ It is the easiest to visualize, but the least informative of all.
165+
166+ <pre >
167+ >5PTI_SS-annotation
168+ GGGGS S EEEEEEETTTTEEEEEEE SSS SS BSSHHHHHHHH
169+ </pre >
170+
171+ - **Helix Summary**: similar to the FASTA format, but it also contains information about the helical turns.
172+
173+ <pre >
174+ 3 turn: >>><<<
175+ 4 turn: >444< >>>>XX<<<<
176+ 5 turn: >5555<
177+ SS: GGGGS S EEEEEEETTTTEEEEEEE SSS SS BSSHHHHHHHH
178+ AA: RPDFCLEPPYTGPCKARIIRYFYNAKAGLCQTFVYGGCRAKRNNFKSAEDCMRTCGGA
179+ </pre >
180+
181+ - **Secondary Structure Elements**: another way to visualize the SS annotation is by compacting sequential residues that share the same SS type and assigning an ID to the range. In this way, a structure can be described by
182+ a collection of helices, strands, turns, etc., and each element can be identified by an ID (e.g. helix 1 (H1), beta-strand 6 (E6), etc.).
183+
184+ <pre >
185+ G1: 3 - 6
186+ S1: 7 - 7
187+ S2: 13 - 13
188+ E1: 18 - 24
189+ T1: 25 - 28
190+ E2: 29 - 35
191+ S3: 37 - 39
192+ S4: 42 - 43
193+ B1: 45 - 45
194+ S5: 46 - 47
195+ H1: 48 - 55
196+ </pre >
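
The compaction behind the Secondary Structure Elements view can be sketched in plain Java. This is a minimal illustration, not the BioJava implementation: the class and method names are hypothetical, and the input string was reconstructed from the element ranges listed above, with `-` written for coil (instead of a blank) so that the spacing stays visible.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SseSketch {

    // Per-residue SS of 5PTI, rebuilt from the ranges above ('-' = coil)
    static final String BPTI_SS =
            "--GGGGS-----S----EEEEEEETTTTEEEEEEE-SSS--SS-BSSHHHHHHHH---";

    /** Groups runs of equal SS type into "<type><id>: <start> - <end>". */
    static List<String> compact(String ss) {
        List<String> elements = new ArrayList<>();
        Map<Character, Integer> counters = new HashMap<>();
        int i = 0;
        while (i < ss.length()) {
            char type = ss.charAt(i);
            int j = i;
            // Extend the run while the next residue has the same type
            while (j + 1 < ss.length() && ss.charAt(j + 1) == type) {
                j++;
            }
            if (type != '-' && type != ' ') { // coil is not an element
                int id = counters.merge(type, 1, Integer::sum);
                // Positions are reported 1-based
                elements.add(type + "" + id + ": " + (i + 1) + " - " + (j + 1));
            }
            i = j + 1;
        }
        return elements;
    }

    public static void main(String[] args) {
        compact(BPTI_SS).forEach(System.out::println);
    }
}
```

Each SS type keeps its own counter, which is why the two strands come out as E1 and E2 while the single helix is H1.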
197+
198+ You can find examples of how to get the different file formats in the class `DemoSecStrucPred` in the **demo**
199+ package.
94200
95201<!-- automatically generated footer-->
96202