Line: 1 to 1 | ||||||||
---|---|---|---|---|---|---|---|---|
| ||||||||
Deleted: | ||||||||
< < | ||||||||
VoteAbstractsJan2006 | ||||||||
Line: 99 to 98 | ||||||||
Microarrays represent a powerful technology that provides the ability to simultaneously measure the expression of thousands of genes. However, it is a multi-step process with numerous potential sources of variation that can compromise data analysis and interpretation if left uncontrolled, necessitating the development of quality control protocols to ensure assay consistency and high-quality data. In response to emerging standards, such as the minimum information about a microarray experiment standard, tools are required to ascertain the quality and reproducibility of results within and across studies. To this end, an intralaboratory quality control protocol for two color, spotted microarrays was developed using cDNA microarrays from in vivo and in vitro dose-response and time-course studies. The protocol combines: (i) diagnostic plots monitoring the degree of feature saturation, global feature and background intensities, and feature misalignments with (ii) plots monitoring the intensity distributions within arrays with (iii) a support vector machine (SVM) model. The protocol is applicable to any laboratory with sufficient datasets to establish historical high- and low-quality data. | ||||||||
Added: | ||||||||
> > | Bioinformatics (selected by Baharak)
[B1] Sequence-based heuristics for faster annotation of non-coding RNA families |
Line: 1 to 1 | ||||||||
---|---|---|---|---|---|---|---|---|
| ||||||||
Added: | ||||||||
> > |
VoteAbstractsJan2006
RNA Journal (suggested by Holger)
[H1] Natural selection is not required to explain universal compositional patterns in rRNA secondary structure categories
Other (suggested by Holger)
[H3] Protein similarity search under mRNA structural constraints: application to selenocysteine incorporation | |||||||
BMC Bioinformatics (selected by Mirela) | ||||||||
Changed: | ||||||||
< < | [M1] An Approach for Clustering Gene Expression Data with Error Information | |||||||
> > | [M1] An Approach for Clustering Gene Expression Data with Error Information | |||||||
Background. Clustering of gene expression patterns is a well-studied technique for elucidating trends across large numbers of transcripts and for identifying likely co-regulated genes. Even the best clustering methods, however, are unlikely to provide meaningful results if too much of the data is unreliable. With the maturation of microarray technology, a wealth of research on statistical analysis of gene expression data has encouraged researchers to consider error and uncertainty in their microarray experiments, so that experiments are being performed increasingly with repeat spots per gene per chip and with repeat experiments. One of the challenges is to incorporate the measurement error information into downstream analyses of gene expression data, such as traditional clustering techniques. | ||||||||
Line: 11 to 34 | ||||||||
Conclusions. The additional information provided by replicate gene expression measurements is a valuable asset in effective clustering. Gene expression profiles with high errors, as determined from repeat measurements, may be unreliable and may associate with different clusters, whereas gene expression profiles with low errors can be clustered with higher specificity. Results indicate that including error information from repeat gene expression measurements can lead to significant improvements in clustering accuracy. | ||||||||
Added: | ||||||||
> > | [M2] In silico discovery of human natural antisense transcripts | |||||||
Current Opinion in Structural Biology (selected by Sanja) | ||||||||
Changed: | ||||||||
< < | [S1] RNA structure: the long and short of it (review article) | |||||||
> > | [S1] RNA structure: the long and short of it (review article) | |||||||
The database of RNA structure has grown tremendously since the crystal structure analyses of ribosomal subunits in 2000–2001. During the past year, the trend toward determining the structure of large, complex biological RNAs has accelerated, with the analysis of three intact group I introns, A- and B-type ribonuclease P RNAs, a riboswitch–substrate complex and other structures. The growing database of RNA structures, coupled with efforts directed at the standardization of nomenclature and classification of motifs, has resulted in the identification and characterization of numerous RNA secondary and tertiary structure motifs. Because a large proportion of RNA structure can now be shown to be composed of these recurring structural motifs, a view of RNA as a modular structure built from a combination of these building blocks and tertiary linkers is beginning to emerge. At the same time, however, more detailed analysis of water, metal, ligand and protein binding to RNA is revealing the effect of these moieties on folding and structure formation. The balance between the views of RNA structure either as strictly a construct of preformed building blocks linked in a limited number of ways or as a flexible polymer assuming a global fold influenced by its environment will be the focus of current and future RNA structural biology. | ||||||||
Changed: | ||||||||
< < | [S2]Structure, folding and mechanisms of ribozymes (review article) | |||||||
> > | [S2] Structure, folding and mechanisms of ribozymes (review article) | |||||||
The past two years have seen exciting developments in RNA catalysis. A completely new ribozyme (possibly two) has come along and several new structures have been determined, including three different group I intron species. Although the origins of catalysis remain incompletely understood, a significant convergence of views has happened in the past year, together with the discovery of new super-fast ribozymes. There is persuasive evidence of general acid-base chemistry in nucleolytic ribozymes, whereas catalysis of peptidyl transfer in the ribosome seems to result largely from orientation and proximity effects. Lastly, important new folding-enhancing elements have been discovered.
Science (selected by Sanja) | ||||||||
Changed: | ||||||||
< < | [S3] The Widespread Impact of Mammalian MicroRNAs on mRNA Repression and Evolution | |||||||
> > | [S3] The Widespread Impact of Mammalian MicroRNAs on mRNA Repression and Evolution | |||||||
Thousands of mammalian messenger RNAs are under selective pressure to maintain 7-nucleotide sites matching microRNAs (miRNAs). We found that these conserved targets are often highly expressed at developmental stages before miRNA expression and that their levels tend to fall as the miRNA that targets them begins to accumulate. Nonconserved sites, which outnumber the conserved sites 10 to 1, also mediate repression. As a consequence, genes preferentially expressed at the same time and place as a miRNA have evolved to selectively avoid sites matching the miRNA. This phenomenon of selective avoidance extends to thousands of genes and enables spatial and temporal specificities of miRNAs to be revealed by finding tissues and developmental stages in which messages with corresponding sites are expressed at lower levels.
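To make the notion of a 7-nucleotide site concrete, here is a minimal sketch; it is not the paper's pipeline. It assumes a site is the reverse complement of miRNA nucleotides 2-8, one common definition that the abstract itself does not spell out. The function names and the example UTR fragment are hypothetical.

```python
from typing import List

# Watson-Crick complements for RNA bases.
_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna: str) -> str:
    """Return the 7-nt mRNA site complementary to miRNA nucleotides 2-8.

    Assumption: the site is the reverse complement of positions 2-8
    (1-based) of the miRNA; other site definitions exist.
    """
    rna = mirna.upper().replace("T", "U")
    seed = rna[1:8]  # positions 2-8, 1-based
    return "".join(_COMPLEMENT[base] for base in reversed(seed))


def find_sites(utr: str, mirna: str) -> List[int]:
    """Return 0-based start positions of every seed site in a UTR sequence."""
    site = seed_site(mirna)
    rna = utr.upper().replace("T", "U")
    positions = []
    start = rna.find(site)
    while start != -1:
        positions.append(start)
        start = rna.find(site, start + 1)  # allow overlapping matches
    return positions


if __name__ == "__main__":
    let7a = "UGAGGUAGUAGGUUGUAUAGUU"  # mature let-7a sequence
    toy_utr = "AAACUACCUCAAA"         # hypothetical UTR fragment
    print(seed_site(let7a), find_sites(toy_utr, let7a))
```

Counting such sites separately in conserved and nonconserved UTR regions would be one way to explore the 10-to-1 ratio the abstract mentions, though the conservation analysis itself is outside this sketch.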
PLoS Computational Biology (selected by Sanja) | ||||||||
Changed: | ||||||||
< < | [S4] New Maximum Likelihood Estimators for Eukaryotic Intron Evolution | |||||||
> > | [S4] New Maximum Likelihood Estimators for Eukaryotic Intron Evolution | |||||||
The evolution of spliceosomal introns remains poorly understood. Although many approaches have been used to infer intron evolution from the patterns of intron position conservation, the results to date have been contradictory. In this paper, we address the problem using a novel maximum likelihood method, which allows estimation of the frequency of intron insertion target sites, together with the rates of intron gain and loss. We analyzed the pattern of 10,044 introns (7,221 intron positions) in the conserved regions of 684 sets of orthologs from seven eukaryotes. We determined that there is an average of one target site per 11.86 base pairs (bp) (95% confidence interval, 9.27 to 14.39 bp). In addition, our results showed that: (i) overall intron gains are ~25% greater than intron losses, although specific patterns vary with time and lineage; (ii) parallel gains account for ~18.5% of shared intron positions; and (iii) reacquisition following loss accounts for ~0.5% of all intron positions. Our results should assist in resolving the long-standing problem of inferring the evolution of spliceosomal introns. | ||||||||
Changed: | ||||||||
< < | [S5] Genome-Wide Identification of Human Functional DNA Using a Neutral Indel Model | |||||||
> > | [S5] Genome-Wide Identification of Human Functional DNA Using a Neutral Indel Model | |||||||
It has become clear that a large proportion of functional DNA in the human genome does not code for protein. Identification of this non-coding functional sequence using comparative approaches is proving difficult and has previously been thought to require deep sequencing of multiple vertebrates. Here we introduce a new model and comparative method that, instead of nucleotide substitutions, uses the evolutionary imprint of insertions and deletions (indels) to infer the past consequences of selection. The model predicts the distribution of indels under neutrality, and shows an excellent fit to human–mouse ancestral repeat data. Across the genome, many unusually long ungapped regions are detected that are unaccounted for by the neutral model, and which we predict to be highly enriched in functional DNA that has been subject to purifying selection with respect to indels. We use the model to determine the proportion under indel-purifying selection to be between 2.56% and 3.25% of human euchromatin. Since annotated protein-coding genes comprise only 1.2% of euchromatin, these results lend further weight to the proposition that more than half the functional complement of the human genome is non-protein-coding. The method is surprisingly powerful at identifying selected sequence using only two or three mammalian genomes. Applying the method to the human, mouse, and dog genomes, we identify 90 Mb of human sequence under indel-purifying selection, at a predicted 10% false-discovery rate and 75% sensitivity. As expected, most of the identified sequence represents unannotated material, while the recovered proportions of known protein-coding and microRNA genes closely match the predicted sensitivity of the method. The method's high sensitivity to functional sequence such as microRNAs suggests that as yet unannotated microRNA genes are enriched among the sequences identified. Furthermore, its independence of substitutions allowed us to identify sequence that has been subject to heterogeneous selection, that is, sequence subject to both positive selection with respect to substitutions and purifying selection with respect to indels. The ability to identify elements under heterogeneous selection enables, for the first time, the genome-wide investigation of positive selection on functional elements other than protein-coding genes. | ||||||||
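The "more than half" claim in the abstract can be checked with simple arithmetic. The sketch below uses only the percentages quoted above and assumes, as a simplification not stated in the abstract, that all annotated protein-coding sequence counts as being under indel-purifying selection. The function name is hypothetical.

```python
def noncoding_share(selected_pct: float, coding_pct: float = 1.2) -> float:
    """Fraction of selected sequence that is not protein-coding.

    Assumption: every annotated protein-coding base (coding_pct of
    euchromatin) lies inside the selected fraction (selected_pct).
    """
    return 1.0 - coding_pct / selected_pct


if __name__ == "__main__":
    # Lower and upper estimates of euchromatin under indel-purifying selection.
    for selected in (2.56, 3.25):
        share = noncoding_share(selected)
        print(f"{selected}% selected: {share:.1%} non-protein-coding")
```

Under that assumption the non-protein-coding share comes out between roughly 53% and 63%, consistent with the abstract's statement.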
Changed: | ||||||||
< < | [S6] Ten Simple Rules for Getting Published | |||||||
> > | [S6] Ten Simple Rules for Getting Published | |||||||
NAR (selected by Dan) |
Line: 1 to 1 | ||||||||
---|---|---|---|---|---|---|---|---|
BMC Bioinformatics (selected by Mirela) | ||||||||
Line: 45 to 45 | ||||||||
[S6] Ten Simple Rules for Getting Published | ||||||||
Added: | ||||||||
> > |
NAR (selected by Dan)
[D1] Application of a superword array in genome assembly |
Line: 1 to 1 | ||||||||
---|---|---|---|---|---|---|---|---|
| ||||||||
Deleted: | ||||||||
< < | ||||||||
BMC Bioinformatics (selected by Mirela)
[M1] An Approach for Clustering Gene Expression Data with Error Information | ||||||||
Line: 13 to 11 | ||||||||
Conclusions. The additional information provided by replicate gene expression measurements is a valuable asset in effective clustering. Gene expression profiles with high errors, as determined from repeat measurements, may be unreliable and may associate with different clusters, whereas gene expression profiles with low errors can be clustered with higher specificity. Results indicate that including error information from repeat gene expression measurements can lead to significant improvements in clustering accuracy. | ||||||||
Added: | ||||||||
> > |
Current Opinion in Structural Biology (selected by Sanja)
[S1] RNA structure: the long and short of it (review article)
Science (selected by Sanja)
[S3] The Widespread Impact of Mammalian MicroRNAs on mRNA Repression and Evolution
PLoS Computational Biology (selected by Sanja)
[S4] New Maximum Likelihood Estimators for Eukaryotic Intron Evolution |
Line: 1 to 1 | ||||||||
---|---|---|---|---|---|---|---|---|
Added: | ||||||||
> > |
BMC Bioinformatics (selected by Mirela)
[M1] An Approach for Clustering Gene Expression Data with Error Information |