Why long noncoding RNAs (lncRNAs)?
Around 99% of our genome is non-coding DNA, a vast and largely unexplored wilderness. Within this wilderness are many thousands of poorly understood genes named “long non-coding RNAs”, or lncRNAs. Understanding the biological and medical significance of these genes is a grand challenge in biology today.
As their name implies, lncRNAs are long RNA molecules that are unusual because they apparently do not encode any protein. It is this property (or lack of property) that distinguishes lncRNAs from the well-known messenger RNAs (mRNAs) described in the Central Dogma of molecular biology. mRNAs serve only to specify the production of protein. In contrast, many lncRNAs are thought to be intrinsically functional RNAs that act through their sequence, their structure, or both.
Available evidence suggests that lncRNAs play diverse roles in the cell: regulating chromatin structure and gene transcription, controlling mRNA stability and translation, forming components of cellular structures, and binding to and regulating proteins. Several lncRNAs are known to play central roles in fundamental biological processes, exemplified by the essential X-chromosome silencing RNA, XIST. It is thought that lncRNAs, like proteins, are modular molecules composed of several independent domains. These domains are capable of interacting with protein, RNA and DNA, making lncRNAs versatile regulators within the cell.
The first lncRNAs were discovered in the 1990s, and since then researchers have catalogued tens of thousands of these enigmatic genes in humans alone. Some lncRNAs are conserved through hundreds of millions of years of evolution, while others appear to arise rapidly and may help drive species-specific traits.
Why study lncRNAs?
The total number of lncRNAs in our genome remains unknown, but we suspect that it may lie in the region of 50,000 to 100,000. This is an enormous number, considering that there are only ~19,000 protein-coding genes. To date, researchers have investigated fewer than 2,000 lncRNAs, meaning that probably 96 to 98% of lncRNAs are completely uncharacterised.
The reason we should study these genes is that they represent a vast potential genetic space that may explain critical aspects of biology and disease. There is compelling evidence that hundreds of lncRNAs play important roles in cancer and can be targeted therapeutically.
Therefore, lncRNAs offer an exciting opportunity to understand disease. However, their massive numbers and unique properties (low expression, poorly understood molecular mechanisms) present a scientific challenge that requires the development of a new suite of molecular and informatic tools.