Getting Started with RNA-Seq

    While microarrays have historically been used for RNA profiling, this technique suffers from significant shortcomings, including low accuracy and low sensitivity. In contrast, RNA sequencing, commonly referred to as RNA-seq, can deliver unbiased information about the transcriptome. This method allows the identification of exons and introns, as well as identification of the 5´ and 3´ ends of genes.

    Step 1. Creating a cDNA Library

    Several methods can be used to generate an RNA-seq library, and the details of these methods will be dependent on the platform used for high-throughput sequencing. However, there are common steps:

    • rRNA removal: Since the majority of RNA molecules present in a cell are ribosomal RNA (rRNA), which is generally not of interest, rRNA should be removed before making a library from the RNA of interest. Two popular options for this step are:
        • mRNA isolation: This involves targeting the polyadenylated (poly(A)) tails so that polyadenylated transcripts are separated from RNA that lacks a poly(A) tail, including rRNA.
        • rRNA depletion: Total RNA may be depleted of rRNA using a number of methods, most of which involve hybridization of oligos to the rRNA followed by removal. rRNA depletion enables subsequent sequencing of all non-rRNA molecules, and is not limited to intact mRNA molecules.
    • Fragmentation: Fragmentation generates fragments of an appropriate size for sequencing, and is commonly accomplished by an RNA fragmentation step prior to reverse transcription, rather than by fragmentation of cDNA.
    • Reverse transcription and second-strand cDNA synthesis: RNA is converted into a single-stranded cDNA library via random or oligo(dT) primers and a reverse transcriptase. The resulting cDNA is then converted into double-stranded cDNA by a DNA polymerase.
    • End repair, dA-tailing and adaptor ligation: The double-stranded cDNA is end-repaired, optionally dA-tailed (depending on the sequencing platform to be used), and then ligated to 3´ and 5´ adaptors. The library is then ready for amplification and sequencing.
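
The difference between the two rRNA removal options listed above can be illustrated with a toy model. This is a minimal sketch, not a simulation of any real protocol: the `Rna` record, the molecule names, and the sample contents are all hypothetical, and each molecule is reduced to just two properties (its class and whether it carries a poly(A) tail).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rna:
    """Hypothetical, highly simplified description of one RNA molecule."""
    name: str
    kind: str         # e.g. "rRNA", "mRNA", "ncRNA"
    has_poly_a: bool  # True if the molecule still carries a poly(A) tail


def poly_a_selection(sample):
    """Model mRNA isolation: keep only molecules with a poly(A) tail."""
    return [rna for rna in sample if rna.has_poly_a]


def rrna_depletion(sample):
    """Model rRNA depletion: remove rRNA and keep every other molecule."""
    return [rna for rna in sample if rna.kind != "rRNA"]


# Hypothetical toy sample; names are placeholders, not real transcripts.
SAMPLE = [
    Rna("rRNA_large", "rRNA", False),
    Rna("rRNA_small", "rRNA", False),
    Rna("mRNA_intact", "mRNA", True),
    Rna("mRNA_fragment", "mRNA", False),  # degraded mRNA that lost its tail
    Rna("ncRNA", "ncRNA", False),         # non-polyadenylated noncoding RNA
]
```

Calling `poly_a_selection(SAMPLE)` keeps only the intact, polyadenylated mRNA, while `rrna_depletion(SAMPLE)` also keeps the tail-less mRNA fragment and the noncoding RNA. This mirrors the point that rRNA depletion is not limited to intact mRNA molecules.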

    Step 2. Sequencing and Analysis

    High-throughput sequencing technologies generate a large number of sequence reads from a library of DNA fragments. Sequence reads are then mapped against a reference genome. Software packages are available for short-read alignment, and specialized tools for transcriptome analysis have been developed, including TopHat for splice-aware read alignment and Cufflinks for transcript assembly and abundance estimation.
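
To make the idea of mapping reads against a reference concrete, here is a deliberately naive sketch. It places each read by exact string matching and counts the reads that fall entirely within an annotated gene. Real aligners use indexed, mismatch-tolerant and splice-aware algorithms, so this is not how TopHat or any production tool works. The reference sequence, gene coordinates, and reads are all invented for illustration.

```python
from collections import Counter

# Hypothetical 34-base "genome" and gene annotation (0-based, half-open).
REFERENCE = "ACGTTGCAAGGCTTACCGATAGGCTAACGTTAGC"
GENES = {"geneA": (0, 16), "geneB": (16, 34)}

# Hypothetical reads.
READS = ["TGCAAGGC", "GCAAGGCTT", "GATAGGCT", "ACGTT", "GGGGGGGG", "TACCGATA"]


def find_all(read, reference):
    """Return every 0-based position where the read matches exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions


def count_reads(reads, reference, genes):
    """Count uniquely mapped reads per gene; track the rest by category."""
    counts = Counter()
    for read in reads:
        positions = find_all(read, reference)
        if not positions:
            counts["unmapped"] += 1
            continue
        if len(positions) > 1:
            counts["multimapped"] += 1
            continue
        start = positions[0]
        end = start + len(read)
        for gene, (gene_start, gene_end) in genes.items():
            if gene_start <= start and end <= gene_end:
                counts[gene] += 1
                break
        else:
            # Mapped uniquely, but not contained in any single gene.
            counts["no_feature"] += 1
    return counts
```

`count_reads(READS, REFERENCE, GENES)` returns a `Counter` of per-gene read counts. Reads that match nowhere, match more than once, or straddle a gene boundary are tallied separately rather than being assigned to a gene.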

    Important factors to consider when performing RNA-seq:

    Several methods are currently available for library preparation for RNA-seq, many of which offer simplified protocols and improved yields. However, the quality and accurate quantitation of input RNA remain critical to successful cDNA synthesis and library construction. The following are some important factors to consider:

    • Quality of RNA sample: High-quality RNA is essential for successful cDNA library preparation, and care should be taken when handling RNA samples. As RNA is prone to degradation by ribonucleases, an RNase-free environment is essential. For tips on avoiding RNase contamination, see Ribonuclease Contamination. The RNA Integrity Number (RIN), as determined by the Agilent Bioanalyzer®, is a useful measurement of RNA quality, and a RIN of 7 or higher is ideal.
    • Quantity of RNA needed: Protocols suitable for input amounts in the low ng range are now available, and accurate quantitation of input RNA is important. Contaminants present in the sample may affect quantitation; for example, free nucleotides or other organic compounds routinely used to extract RNA will also absorb UV light near 260 nm, and will result in an over-estimation of RNA concentration when spectrophotometric methods are used.
    • Strand-specific or non-directional libraries: Unlike standard methods, directional (strand-specific) protocols for sequencing RNA provide information on the DNA strand from which the RNA strand was transcribed. This is useful for many reasons including:
        • Identification of antisense transcripts
        • Determination of the transcribed strand of noncoding RNAs
        • Determination of expression levels of coding or noncoding overlapping transcripts
    • RNA expression uniformity: It is important to be aware that results are tissue-specific and time-dependent. Gene expression is not uniform throughout an organism’s cells, and it is strongly dependent on the tissue being assessed. In addition, gene expression levels change over the lifetime of a cell.
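
The spectrophotometric quantitation mentioned under "Quantity of RNA needed" can be sketched as a short calculation. It relies on the widely used conversion that an A260 of 1.0 corresponds to roughly 40 µg/mL of single-stranded RNA at a 1 cm path length. The function name, parameters, and absorbance readings below are hypothetical, chosen only to show how a contaminant that absorbs near 260 nm inflates the estimate.

```python
# Widely used conversion: A260 of 1.0 ~ 40 ug/mL single-stranded RNA (1 cm path).
RNA_UG_PER_ML_PER_A260 = 40.0


def rna_concentration_ug_per_ml(a260, dilution_factor=1.0, path_length_cm=1.0):
    """Estimate RNA concentration (ug/mL) from absorbance at 260 nm.

    The estimate assumes that RNA is the only species absorbing at 260 nm.
    """
    if path_length_cm <= 0:
        raise ValueError("path_length_cm must be positive")
    return (a260 / path_length_cm) * RNA_UG_PER_ML_PER_A260 * dilution_factor


# Hypothetical readings: the RNA alone would give A260 = 0.25, but a
# contaminant absorbing near 260 nm adds 0.05 to the measured value.
A260_FROM_RNA = 0.25
A260_FROM_CONTAMINANT = 0.05

true_estimate = rna_concentration_ug_per_ml(A260_FROM_RNA)
inflated_estimate = rna_concentration_ug_per_ml(
    A260_FROM_RNA + A260_FROM_CONTAMINANT
)
```

With these made-up readings the contaminated sample is estimated at about 12 µg/mL instead of about 10 µg/mL, a 20% over-estimation of the input RNA.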