To date, only 2% of lncRNAs have been functionally characterized, and little is known about their molecular mechanisms and transcript properties. Bioinformatic tools developed for the protein-coding world are generally ineffective for lncRNAs, complicating their classification and the prediction of their functions. A major bottleneck is the lack of frameworks for categorising lncRNAs and for deciphering, at high throughput, how molecular interactions are encoded in their sequences. In our lab, we aim to address this acute need for approaches to functionally classify lncRNAs and identify their sequence domains.
Repetitive and transposable elements: roles in lncRNAs
The sequence domains underlying long noncoding RNA (lncRNA) activities remain largely unknown. It has been proposed that these domains can originate from neofunctionalised fragments of transposable elements (TEs), otherwise known as RIDLs (Repeat Insertion Domains of Long Noncoding RNA), although just a handful have been identified.
We are interested in distinguishing functional RIDL instances from the large genomic background of neutrally evolving TEs across the transcriptome. We have shown evidence that a subset of TE types experiences evolutionary selection in the context of lncRNA exons, and that their host lncRNAs tend to be functionally validated and associated with disease.
A question that has emerged in the lncRNA field in recent years is how the specific subcellular locations of lncRNAs are encoded in their primary sequence. We hypothesise that RIDLs may play a role in directing lncRNAs to subcellular compartments. We are therefore particularly interested in using this group of RIDLs to explore the relationship between TEs and lncRNA subcellular localisation. We use global localisation data from human cell lines to investigate a potential contribution of evolutionarily conserved RIDL elements to the nuclear/cytoplasmic distribution of transcripts. At present, we have shown a role for L2b, MIRb and MIRc TEs in regulating the nuclear enrichment of lncRNAs.
For more information see these papers:
The RIDL hypothesis; Johnson and Guigó (2014)
Because molecular functions depend on physical interactions, and physical interactions in turn depend on co-localization, we expect the subcellular localization of lncRNAs to be crucial for understanding their roles and regulation in cells. Hence, we approach the categorization of lncRNAs by studying their subcellular expression and creating subcellular maps of lncRNAs in human cell lines. These maps are the basis for studying the gene features, sequence domains and post-transcriptional regulation of lncRNAs that share the same subcellular fate.
An important source of confusion over RNA localisation arises from how we define nuclear/cytoplasmic enrichment for polyA+ RNA. Conventionally, relative localisation is defined as the ratio of the concentrations of an RNA between two compartments. Conversely, absolute localisation is defined as the ratio of the number of molecules between two compartments. We recently developed a method to infer absolute localisation from RNA-seq data, and compared the results to relative quantifications from the same cells. More information can be found in our recent article (manuscript submitted). The entire set of quantifications can be downloaded here:
- lncRNA relative quantification
- mRNA relative quantification
- lncRNA absolute quantification
- mRNA absolute quantification
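The difference between the two definitions above can be illustrated with a minimal sketch. It is not the method described in the submitted manuscript. It assumes, for illustration only, that the number of molecules in a compartment is proportional to the RNA concentration measured there multiplied by the total amount of polyA+ RNA in that compartment. The function names, the log2 scale and all numbers are hypothetical.

```python
import math


def relative_localisation(conc_cyto, conc_nuc):
    """log2 ratio of RNA concentrations (cytoplasm / nucleus)."""
    return math.log2(conc_cyto / conc_nuc)


def absolute_localisation(conc_cyto, conc_nuc, total_cyto, total_nuc):
    """log2 ratio of molecule numbers (cytoplasm / nucleus).

    Assumption: molecules in a compartment are proportional to
    concentration x total polyA+ RNA amount in that compartment.
    """
    return math.log2((conc_cyto * total_cyto) / (conc_nuc * total_nuc))


# Hypothetical transcript: 4x more concentrated in the nucleus,
# in a cell whose cytoplasm holds 4x more polyA+ RNA than its nucleus.
rel = relative_localisation(conc_cyto=2.0, conc_nuc=8.0)
ab = absolute_localisation(conc_cyto=2.0, conc_nuc=8.0,
                           total_cyto=4.0, total_nuc=1.0)
print(rel, ab)
```

In this made-up case the transcript appears nuclear-enriched by concentration, yet the same number of molecules sits in each compartment, which is the kind of discrepancy that makes the choice of definition matter.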
We have also made relative quantification maps available to the research community through a web server, lncATLAS (see below).
lncATLAS is an easy-to-use, web-based visualization tool that provides information about the relative localization of lncRNAs in human cells based on RNA-seq.
The website is available at: http://lncatlas.crg.eu/
For a given lncRNA of interest, using lncATLAS you can:
1) Inspect the cytoplasmic-nuclear localisation of your gene of interest (GOI) in 15 different cell lines
2) Inspect the cytoplasmic-nuclear localisation of your GOI within the distribution of mRNA and lncRNA genes
3) Inspect the localisation of your GOI at sub-compartment level: localisation values within distributions for the K562 cell line subnuclear and subcytoplasmic compartments
Usage and the sources of the data are described in Mas-Ponte et al. (2017).
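As a rough illustration of what it means to place a GOI "within the distribution" of other genes, the sketch below computes a percentile rank for one localisation value against a background set. It is not lncATLAS code and does not use the lncATLAS data. The function name and all values are hypothetical.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(value, background):
    """Percentage of background values below `value` (ties count half)."""
    ordered = sorted(background)
    below = bisect_left(ordered, value)
    ties = bisect_right(ordered, value) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Hypothetical localisation values (log2 cytoplasm/nucleus) for background genes.
background = [-3.0, -1.5, -0.5, 0.0, 0.5, 1.0, 2.0, 2.5]
goi_value = -2.0  # hypothetical value for the gene of interest

print(percentile_rank(goi_value, background))
```

A low percentile on this scale would indicate that the GOI is more nuclear than most genes in the background set.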