Phenotype-defining functions of multiple non-coding RNA pathways

GV Glinsky - Cell Cycle, 2008 - Taylor & Francis
One of the surprising revelations of the initial stage of the ENCODE project was the conclusion that more than 90% of the human genome is transcribed. A major component of this vast transcriptional output is represented by highly heterogeneous families of transcripts defined as short non-coding RNAs (sncRNAs) with no or limited protein-coding potential. Here we carried out sequence homolog profiling of 2301 human sncRNAs with confirmed sequence identities [including 943 transintrons; 235 expressed distal intergenic sequences (EDIS); and 1005 piRNAs] as well as >1000 hypothetical transcripts derived from allelic variants of human SNP sequences with strong associations to human diseases or linkages to phenotypes established in genome-wide association studies. Unexpectedly, this analysis reveals a structural feature common to ~85% of the analyzed sncRNA sequences and 488 human microRNAs. This structural feature, shared by multiple seemingly unrelated sncRNA pathways, points to a multitude of potential functional and regulatory implications involving mechanisms of gene expression regulation; control of the biogenesis, stability, and bioactivity of microRNAs; sncRNA-guided macromolecular interactions; and the transcriptional basis of self/non-self discrimination by the immune system. Our analysis implies that hundreds of thousands of non-protein-coding transcripts contribute to the phenotype-defining regulatory and structural features of a cell. Therefore, definitions of genes as structural elements of a genome contributing to phenotypes should be expanded beyond the physical boundaries of mRNA-encoding units.
We propose an information-centered model of a cell suggesting that informasomes (the RNP complexes of sncRNAs and Argonaute proteins) are the intracellular structures that provide the increasingly complex structural framework of genomic regulatory functions in higher eukaryotes. This framework facilitates a stochastic (random and probabilistic), rather than deterministic, mode of choices in the sequence of regulatory events defining the phenotype.