The TOPS project is aimed at the representation and analysis of protein secondary structure at the topological level. TOPS diagrams were introduced by D.R.Westhead, D.R.Gilbert, J.M.Thornton, D.C.Hatton and T.P.J.Flores as a formalisation of the "protein cartoons" that biologists have used informally for some time. Currently TOPS diagrams preserve information about secondary structure elements (strands, helices), hydrogen bonds (at the strand level) and spatial orientation (as a set of chiralities). While such a description may sometimes be too simplistic, it has the advantage that search and comparison at the TOPS level can be performed much faster than with other representations (e.g. atomic coordinates). For more detailed information visit one of the TOPS pages referenced in the Links/Contacts section.
TOPS motifs
TOPS motifs are fragments of
TOPS diagrams that are shared by several proteins (domains) believed to
have some biological relationship, together with some biological annotation.
The appearance of such a motif in an unfamiliar protein can suggest its similarity
to the other proteins that share the same motif. The trustworthiness of
such a conclusion may vary; however, it may be quite high if the motif that
has been found has few or no "false" matches among the familiar proteins.
The current motif database is based upon the CATH
classification, i.e. proteins that share the same CATH number prefix
are considered biologically similar. Hence the motifs can be used as a tool
for fast prediction of the CATH number.
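To make "sharing a CATH number prefix" concrete, here is a minimal Python sketch. It assumes that CATH numbers are written as dot-separated fields (for example 3.40.50.720); the helper names `cath_prefix` and `share_prefix` and the choice of prefix depth are illustrative assumptions and are not taken from the TOPS software.

```python
def cath_prefix(cath_number: str, depth: int) -> tuple:
    """Return the first `depth` dot-separated fields of a CATH number."""
    fields = cath_number.strip().split(".")
    if depth < 1 or depth > len(fields):
        raise ValueError(f"depth {depth} is out of range for {cath_number!r}")
    return tuple(fields[:depth])


def share_prefix(a: str, b: str, depth: int) -> bool:
    """True if two CATH numbers agree on their first `depth` fields.

    Hypothetical helper: the depth at which proteins count as
    "biologically similar" is an assumption of this sketch.
    """
    return cath_prefix(a, depth) == cath_prefix(b, depth)
```

Comparing the fields as a tuple rather than as a raw string avoids accidental matches such as "3.4" being treated as a prefix of "3.40".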
The predictive power of this
approach improves significantly if, instead of looking for the presence or
absence of just one particular motif, the unfamiliar protein is characterized
by two sets of motifs (a so-called profile): "positive", i.e. motifs that
are present in this protein, and "negative", i.e. motifs that are absent.
The profile is then compared to the profiles of known classes and the prediction
is based on the results of this comparison. Unfortunately, the profile method
does not work as well if the submitted structure has been split into
domains differently from the structures that were used for profile construction.
One way to deal with this problem is to analyze all possible sub-domains of
the given structure and to report the ones that give the best matches.
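The profile idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how two profiles are compared, so the `agreement` score below (the fraction of motifs on which both profiles agree) is a hypothetical stand-in, and the names `Profile`, `build_profile` and `predict` are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    positive: frozenset  # motifs present in the structure
    negative: frozenset  # motifs that are absent


def build_profile(found_motifs, all_motifs):
    """Split the motif database into present and absent motifs."""
    all_motifs = frozenset(all_motifs)
    positive = frozenset(found_motifs) & all_motifs
    return Profile(positive, all_motifs - positive)


def agreement(query: Profile, reference: Profile) -> float:
    """Assumed score: fraction of motifs on which both profiles agree."""
    total = len(query.positive | query.negative)
    if total == 0:
        return 0.0
    agree = (len(query.positive & reference.positive)
             + len(query.negative & reference.negative))
    return agree / total


def predict(query: Profile, class_profiles: dict) -> list:
    """Rank known classes (keyed by CATH prefix) by agreement, best first."""
    scored = [(agreement(query, p), prefix) for prefix, p in class_profiles.items()]
    return sorted(scored, reverse=True)
```

For example, with four motifs in the database, a query containing m1, m2 and m4 agrees with a class profile containing m1 and m2 on three of the four motifs, so that class is ranked first with a score of 0.75.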
Available services
Three different comparison services are
available. These allow the submission of protein secondary structure in
PDB format and then search the submitted structure for known topology
motifs.
The Best
Motif search looks for a "good" motif, i.e. one that is likely to indicate
that the protein belongs to a particular group. As a result, the topology
motifs found are displayed together with their CATH number prefix (sorry, no
more biological annotation so far) and the estimated likelihood that the prediction
is correct (derived from the percentage of known "false" matches for the motif).
Only motifs with a likelihood of at least 5% are searched for. As a rule,
the more complicated the secondary structure of the submitted protein, the
more likely it is that "good" motifs will be found for it. However, the probability
that you will get any result (for a randomly chosen protein) is only about
1/3.
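The 5% cut-off can be pictured with a small Python sketch. How the likelihood is actually computed from the "false" match counts is not spelled out on this page, so the formula below (the share of a motif's known matches that fall in the expected CATH group) is an assumption, as are the names `estimated_likelihood` and `good_motifs` and the layout of the input data.

```python
def estimated_likelihood(true_matches: int, false_matches: int) -> float:
    """Assumed estimate: share of known matches that are not "false"."""
    total = true_matches + false_matches
    return true_matches / total if total else 0.0


def good_motifs(match_counts, threshold=0.05):
    """Keep motifs whose estimated likelihood reaches the threshold, best first.

    match_counts maps a motif name to a hypothetical record
    (cath_prefix, true_matches, false_matches).
    """
    kept = []
    for motif, (prefix, true_m, false_m) in match_counts.items():
        likelihood = estimated_likelihood(true_m, false_m)
        if likelihood >= threshold:
            kept.append((likelihood, motif, prefix))
    return sorted(kept, reverse=True)
```

Under this assumption, a motif with 1 correct and 99 "false" matches (likelihood 1%) would not be searched for at all, while one with 9 correct and 1 "false" match would be reported with a likelihood of 90%.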
The Profile search
constructs the profile of motifs for the submitted protein, which
is then compared to the profiles of known groups. As a result, the prefixes
of the predicted CATH groups and the likelihood estimates of these predictions
are displayed. In principle you can expect better predictions (and are more likely to get any
results at all) with this search method.
The Profile search with domain finding
performs the profile search for all possible sub-domains of the submitted structure (i.e. for each pair of secondary structure elements (SSEs)
it is assumed that there is a sub-domain with endpoints at these SSEs; sub-domains of length 1 are not considered).
As a result, the sub-domains that give the best predictions are displayed together with the corresponding predictions.
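The enumeration of candidate sub-domains can be sketched as follows. This is a minimal Python illustration, assuming the structure is given as an ordered list of SSEs; the function names and the `score` callback (standing in for a profile search on one fragment) are hypothetical.

```python
from itertools import combinations


def sub_domains(sses):
    """Yield (start, end, fragment) for every pair of SSE indices start < end.

    Requiring start < end means that fragments of length 1 are never produced.
    """
    for start, end in combinations(range(len(sses)), 2):
        yield start, end, sses[start:end + 1]


def best_sub_domains(sses, score, top=3):
    """Return the `top` candidates ranked by a caller-supplied score function."""
    ranked = sorted(sub_domains(sses), key=lambda c: score(c[2]), reverse=True)
    return ranked[:top]
```

A structure with n SSEs yields n(n-1)/2 candidate sub-domains, so the number of profile searches grows quadratically with the size of the submitted structure.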
Links/Contacts
For more information about the TOPS project you can visit one of the following pages: Topology Of Protein Structure, Protein Topology Home Page and Bioinformatics Research Center at Glasgow University. The TOPS motif search page is located at the Institute of Mathematics and Computer Science, University of Latvia and is currently maintained by Juris Viksna.
Software
The comparison service uses
the DSSP program to determine secondary structure from a PDB file
(developed by W. Kabsch and C. Sander) and the TOPS program to construct
a TOPS diagram from the secondary structure (developed by T.Flores and
D.Westhead). The software for topology motif search in TOPS diagrams has
been developed by Juris Viksna.