Project Highlights

1. Protein structure prediction from NMR experiments.
We have developed AMR, a system that fully automatically determines the structure
of a protein from NMR experiments. Initial experiments with AMR were performed
on four small proteins with success.

Methods for reliable synthesis of long genes offer great promise for
protein synthesis via expression of synthetic genes, with
applications to improved analysis of protein structure
and function, as well as engineering of novel proteins.
Current technologies for gene synthesis use computational
methods for design of short oligos, which can then be
reliably synthesized and assembled into the desired
target gene. We have developed efficient algorithms for
special cases of this problem, and have shown that the
general problem is NP-hard.
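
As a rough illustration of what "design of short oligos" involves, the sketch below tiles a target gene into fixed-length, overlapping fragments and checks that they reassemble into the original sequence. It is only a toy under stated assumptions: the function names (`tile_oligos`, `reassemble`) and the default lengths are hypothetical, and it ignores the constraints that make the real design problem hard (strand alternation, melting temperatures, avoiding mis-hybridization). It is not the algorithm described above.

```python
def tile_oligos(gene, oligo_len=40, overlap=20):
    """Split `gene` into overlapping fragments of at most `oligo_len` bases.

    Consecutive fragments share exactly `overlap` bases. Returns a list of
    (start_index, fragment) tuples. Hypothetical helper for illustration.
    """
    if not gene:
        raise ValueError("gene must be non-empty")
    if not 0 < overlap < oligo_len:
        raise ValueError("need 0 < overlap < oligo_len")

    step = oligo_len - overlap
    oligos = []
    start = 0
    while True:
        end = min(start + oligo_len, len(gene))
        oligos.append((start, gene[start:end]))
        if end == len(gene):
            break
        start += step
    return oligos


def reassemble(oligos, overlap=20):
    """Rebuild the gene by dropping the shared prefix of each later fragment."""
    fragments = [fragment for _, fragment in oligos]
    return fragments[0] + "".join(f[overlap:] for f in fragments[1:])


if __name__ == "__main__":
    target = "ATGC" * 25  # 100-base toy target
    pieces = tile_oligos(target)
    for start, fragment in pieces:
        print(start, fragment)
    assert reassemble(pieces) == target
```

A real design tool would search over the breakpoints rather than fixing them on a regular grid, which is where the combinatorial difficulty comes from.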

The accuracy of RNA secondary structure predictions made by free
energy minimization is limited by the quality of the
energy parameters in the underlying free energy model.
The most widely used model, the Turner99 model, has
hundreds of parameters, and so a robust parameter
estimation scheme should efficiently handle large data
sets with thousands of structures. Moreover, the
estimation scheme should also be trained using available
experimental free energy data in addition to structural
data. We have developed a new constraint generation (CG)
method, the first computational approach to RNA free
energy parameter estimation that can be efficiently
trained on large sets of structural as well as
thermodynamic data. Our CG approach employs a novel
iterative scheme, whereby the energy values are first
computed as the solution to a constrained optimization
problem. Then the newly computed energy parameters are
used to update the constraints on the optimization
function, so as to better optimize the energy parameters
in the next iteration. Using our method on biologically
sound data, we obtain revised parameters for the Turner99
energy model. We show that by using our new parameters,
we obtain significant improvements in prediction accuracy
over current state-of-the-art
methods.

Improving the accuracy and efficiency of prediction methods is an
ongoing challenge, particularly for pseudoknotted
secondary structures, in which base pairs overlap.
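
To make "base pairs overlap" concrete: two base pairs (i, j) and (k, l) overlap, or cross, when i < k < j < l. The short sketch below tests a structure for this condition. It assumes a structure is given as a list of index pairs, and the function name `is_pseudoknotted` is hypothetical.

```python
def is_pseudoknotted(pairs):
    """Return True if any two base pairs cross, i.e. i < k < j < l.

    `pairs` is an iterable of (i, j) index tuples with i < j.
    Quadratic in the number of pairs, which is fine for illustration.
    """
    ordered = sorted(pairs)  # sort by opening index
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            if i < k < j < l:
                return True
    return False


if __name__ == "__main__":
    nested = [(0, 10), (2, 8)]   # second pair sits inside the first
    crossing = [(0, 5), (3, 9)]  # second pair opens inside, closes outside
    print(is_pseudoknotted(nested))
    print(is_pseudoknotted(crossing))
```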
State-of-the-art methods, which are based on free energy
minimization, have high run-time complexity (typically
Theta(n^5) or worse), and can handle (minimize over) only
limited types of pseudoknotted structures. We have
developed Hfold, a new approach for prediction of
pseudoknotted structures, motivated by the hypothesis
that RNA structures fold hierarchically, with
pseudoknot-free (non-overlapping) base pairs forming first, and
pseudoknots forming later so as to minimize energy
relative to the folded pseudoknot-free structure. Our
Hfold algorithm uses two-phase energy minimization to
predict hierarchically formed secondary structures in
O(n^3) time, matching the complexity of the best
algorithms for pseudoknot-free secondary structure
prediction via energy minimization.
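
For readers wondering where the O(n^3) bound for pseudoknot-free prediction comes from, the sketch below shows the classic Nussinov-style dynamic program. It is deliberately simplified: it maximizes the number of base pairs instead of minimizing free energy, it is not Hfold, and it does not use the Turner99 parameters. The names `can_pair` and `max_pairs`, and the minimum hairpin loop size of 3, are assumptions made for illustration.

```python
# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    return (a, b) in PAIRS


def max_pairs(seq, min_loop=3):
    """Maximum number of non-crossing base pairs in `seq`.

    best[i][j] holds the optimum for the subsequence seq[i..j]. Three nested
    loops over (span, i, k) give the O(n^3) running time.
    """
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j is unpaired
            # case 2: base j pairs with some k, leaving >= min_loop bases between
            for k in range(i, j - min_loop):
                if can_pair(seq[k], seq[j]):
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(max_pairs("GGGAAAUCC"))
```

Energy-based pseudoknot-free methods follow the same table-filling pattern with a richer scoring function, which is why they share the cubic running time.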