What are long noncoding RNAs (lncRNAs)?
Aside from well-characterised protein-coding genes, 99% of our genome comprises a vast and unexplored wilderness of non-protein-coding DNA or 'Dark Matter'. Previously thought to be non-functional 'junk', Dark Matter is now known to contain complex yet essential regulatory sequences that hold the keys to understanding biology and disease. Amongst its most numerous components are many thousands of poorly-understood genes that do not encode any protein - “long non-coding RNAs”, or lncRNAs. Understanding the biological and biomedical significance of lncRNAs is a Grand Challenge - and opportunity - in biology today.
As their name implies, lncRNAs are long RNA molecules that are unusual because they apparently do not encode any protein. It is this property (or lack of property) that distinguishes lncRNAs from the well-known messenger RNAs (mRNAs) that are described in the Central Dogma of molecular biology. mRNAs serve only to specify the production of protein. In contrast, it is thought that many lncRNAs are intrinsically functional RNAs, acting through either their sequence or their structure.
From available evidence, it would appear that lncRNAs play diverse roles in the cell, from the regulation of chromatin structure and gene transcription, to controlling mRNA stability and translation, to forming components of cellular structures, to binding to and regulating proteins, and even signalling to the outside world. Several lncRNAs are known to play central roles in fundamental biological processes, exemplified by the essential X-chromosome inactivating RNA, XIST. It is thought that lncRNAs, similar to proteins, are modular molecules that are composed of several independent domains. These domains are capable of interacting with protein, RNA and DNA, making lncRNAs versatile regulators within the cell.
The first lncRNAs were discovered in the 1990s, and since then researchers have catalogued tens of thousands of these enigmatic genes in humans alone. Some lncRNAs are conserved through hundreds of millions of years of evolution, while others appear to arise rapidly and may help drive species-specific traits.
Why study lncRNAs?
The total number of lncRNAs in our genome remains unknown, but we suspect that it may lie in the region of 50,000 to 100,000. This is an enormous number, considering that there are only ~19,000 protein-coding genes. To date, researchers have investigated fewer than 2,000 lncRNAs, meaning that, on these estimates, probably 96% to 98% of lncRNAs are completely uncharacterised.
The reason we should study these genes is that they represent a vast potential genetic space that may explain critical aspects of biology and disease. There is compelling evidence that hundreds of lncRNAs play important roles in cancer, either promoting or opposing the pathological "Hallmarks" of tumour cells. They have intriguing properties that make them excellent drug targets, namely their tumour cell-specific activities. For these reasons, we and others are developing new therapeutic strategies focussed on lncRNAs - 'LncRNA Therapeutics'.
Therefore, lncRNAs offer an exciting opportunity to understand disease. However, their massive numbers and unique properties (low expression, poorly understood molecular mechanisms) present a scientific challenge that requires the development of a new suite of molecular and informatic tools.