Detection of Fusion Genes using Targeted RNA-Sequencing

In cancer research and molecular diagnostics, detecting fusion genes with targeted RNA sequencing has emerged as a transformative approach. Traditional detection methods, such as fluorescence in situ hybridization (FISH) and reverse transcription PCR (RT-PCR), offer high sensitivity but are limited to known fusion events, which makes them inadequate for discovering novel fusion genes.

Targeted RNA sequencing (RNA-seq) addresses these limitations by combining high sensitivity with the scalability needed to identify clinically actionable fusion genes, including fusions with previously unknown partners.

This article delves into the latest advancements in targeted RNA sequencing for fusion gene detection, exploring its methodologies, advantages over traditional approaches, and real-world applications in clinical oncology and research.

The Need for Targeted RNA Sequencing for Fusion Gene Detection

Fusion genes, resulting from chromosomal rearrangements, play a critical role in oncogenesis by driving tumor progression. The challenge lies in accurately detecting these fusion events, particularly in heterogeneous tumors where expression levels vary significantly.

Limitations of Conventional Approaches

Conventional methods for detecting gene fusions are limited in sensitivity, scalability, and the ability to detect novel fusion events. They may miss rare fusion variants or be too resource-intensive for routine clinical use.

  • FISH: High specificity but restricted to known fusions; cannot detect novel fusion partners.
  • RT-PCR: Sensitive for targeted fusion detection but lacks scalability and cannot capture unknown fusion variants.
  • Whole-genome sequencing (WGS): Provides comprehensive coverage but is resource-intensive and cost-prohibitive for routine clinical applications.
  • Whole-transcriptome RNA-seq: Allows for global fusion detection but has low sensitivity for rare fusion transcripts, because reads are spread across the entire transcriptome and few of them cover any single fusion junction.

Targeted RNA sequencing strikes the ideal balance, enhancing sensitivity and throughput while reducing sequencing costs and bioinformatics complexity.

Cutting-Edge Techniques in Targeted RNA-Sequencing for Fusion Gene Detection

Targeted RNA sequencing has changed how fusion genes are detected by enabling precise, high-throughput identification of oncogenic rearrangements. The two enrichment approaches described here offer complementary advantages in detecting known and novel fusion events.

1. Anchored Multiplex PCR (AMP) for Unbiased Fusion Detection

Anchored Multiplex PCR (AMP) is a targeted RNA sequencing approach that enables the identification of fusion genes without requiring prior knowledge of fusion partners. Traditional PCR-based fusion detection methods rely on predefined primer sets targeting specific fusion junctions, making them unsuitable for identifying novel or rare fusion events. 

AMP overcomes this limitation by pairing gene-specific primers, anchored to a known exonic region of a target gene, with a universal primer that binds an adapter ligated to the cDNA fragment ends. Because only one side of each amplicon is fixed, extension proceeds into whatever sequence lies next to the target exon, so fusions are detected even when one fusion partner is unknown.

Key advantages:

  • High Sensitivity and Specificity: Unlike traditional RT-PCR, which is limited to detecting predefined fusion junctions, AMP enhances the detection of both known and previously uncharacterized fusion events.
  • Compatibility with Formalin-Fixed, Paraffin-Embedded (FFPE) Samples: AMP is well-suited for clinical oncology applications where tissue preservation methods often degrade RNA integrity.
  • Reduced False Negatives: Because only one partner gene must be targeted by a primer, AMP is less likely to miss fusions with unexpected partners or breakpoints, improving the accuracy of fusion detection.

AMP is a core technology in widely used fusion detection assays, enabling the identification of clinically actionable fusion genes across multiple cancer types, including lung cancer, sarcoma, and hematologic malignancies.

2. Hybrid Capture-Based Enrichment for High-Coverage Detection

Hybrid capture-based enrichment is another advanced targeted RNA sequencing technique that relies on biotinylated oligonucleotide probes to selectively capture and enrich for fusion-related transcripts. 

Unlike AMP, which enriches targets with gene-specific primers, hybrid capture enriches by probe hybridization rather than target-specific PCR. This can reduce primer-related amplification bias and improve the detection of fusion transcripts with low expression levels.

Key advantages:

  • High Sensitivity for Low-Input RNA Samples: Hybrid capture maintains sensitivity even with degraded or low-input RNA, making it particularly effective for FFPE samples.
  • Broader Fusion Detection: By capturing entire gene regions rather than specific junctions, hybrid capture-based sequencing enables the detection of multiple fusion isoforms and complex structural rearrangements.
  • Comprehensive Analysis of Known and Novel Fusions: Unlike PCR-based techniques, which may miss unexpected fusion partners, hybrid capture ensures unbiased fusion discovery.

Hybrid capture-based enrichment is widely used in targeted RNA sequencing platforms and has proven effective for detecting fusion genes across various cancers. This approach is especially beneficial in cases where FFPE-derived RNA samples pose challenges for amplification-based methods.

In a study of pediatric Ewing sarcoma, researchers used hybrid capture RNA sequencing to detect EWSR1-FLI1 fusions in confirmed cases. The study demonstrated that RNA sequencing identified fusion events that had previously been missed, leading to more precise molecular classification and improved treatment recommendations.

These techniques, combined with bioinformatics pipelines, drive improvements in molecular diagnostics and precision oncology.

Bioinformatics Pipelines for Enhanced Fusion Detection: STAR-Fusion, FusionCatcher, and Arriba

Advanced bioinformatics pipelines play an important role in fusion gene detection by accurately identifying, filtering, and annotating fusion events from sequencing data. Three widely used tools, STAR-Fusion, FusionCatcher, and Arriba, provide complementary approaches for detecting gene fusions with high confidence.

STAR-Fusion: Optimized for Chimeric Read Alignment

STAR-Fusion is a fusion detection algorithm built on the Spliced Transcripts Alignment to a Reference (STAR) RNA-seq aligner. It maps RNA sequencing reads to a reference genome and detects chimeric transcripts indicative of fusion genes.

Key features:

  • High Accuracy in Mapping Fusion Junctions: STAR-Fusion efficiently identifies fusion transcripts by aligning split reads across exon-exon junctions.
  • Detection of Complex Rearrangements: This tool is particularly effective at detecting large chromosomal rearrangements, including translocations, inversions, and deletions that lead to fusion events.
  • Low False Positive Rate: It incorporates stringent filtering criteria to eliminate sequencing artifacts and low-confidence fusions.

STAR-Fusion is widely applied in clinical genomics and research, particularly for cancers such as leukemia and sarcoma, where chromosomal translocations frequently drive oncogenesis.

FusionCatcher: Ideal for Low-Purity Tumor Samples

FusionCatcher is designed for detecting fusions in RNA-seq data from tumor samples with low purity, making it particularly useful in clinical oncology settings where samples contain a mix of tumor and normal cells.

Key features:

  • Reliable False Positive Filtering: FusionCatcher cross-references multiple databases to remove recurrent false positive fusions.
  • Effective in Low-Purity Samples: FusionCatcher is designed to retain sensitivity in samples with significant contamination from non-tumor RNA, where fusion-supporting reads are scarce.
  • Comprehensive Fusion Partner Identification: This tool is particularly useful for detecting fusions in complex cancers with heterogeneous genetic profiles.

FusionCatcher has been applied to targeted RNA sequencing data from cancers with low tumor purity, including pancreatic, colorectal, and breast cancers. Its ability to retain sensitivity in highly heterogeneous samples makes it valuable for fusion detection in clinical research and diagnostics.

Arriba: A High-Performance Fusion Detection Algorithm

Arriba is an advanced bioinformatics tool designed for fast and highly accurate fusion gene detection in RNA sequencing data.  Arriba excels at identifying fusion transcripts while maintaining a low false-positive rate, making it particularly suitable for clinical and research applications in oncology. 

Unlike conventional fusion detection algorithms, Arriba incorporates extensive filtering criteria, including blacklists of recurrent false fusions, read-depth-based prioritization, and annotation of oncogenic relevance.

Key features:

  • Superior False Positive Filtering: Arriba maintains an extensive blacklist of recurrent false fusion events found in normal tissues, preventing erroneous calls.
  • Speed and Efficiency: Compared to other bioinformatics tools, Arriba is optimized for speed, making it suitable for large-scale transcriptomic studies and clinical diagnostics.
  • Oncogenic Relevance Annotation: Arriba integrates databases of known oncogenic fusion genes, prioritizing clinically relevant fusions that impact cancer biology.
  • Visualization of Fusion Events: The tool provides detailed visualization of fusion breakpoints, facilitating deeper analysis of fusion mechanisms.

Arriba has been successfully implemented in fusion detection for hematologic malignancies, sarcomas, and solid tumors, where chromosomal rearrangements drive oncogenesis. It is widely used in personalized oncology research, allowing researchers and clinicians to prioritize therapeutically actionable fusion genes.

Mechanisms of Fusion Gene Formation

Fusion genes arise due to chromosomal rearrangements, which disrupt and reconfigure genomic sequences, resulting in the formation of hybrid genes. These structural alterations frequently occur in cancers and are key drivers of oncogenesis. The primary mechanisms include:

  1. Chromosomal Translocations: A segment of one chromosome relocates to another chromosome, leading to gene fusions.

    For instance, BCR-ABL1 fusion in chronic myeloid leukemia (CML) results from a reciprocal translocation between chromosomes 9 and 22, forming the Philadelphia chromosome. This fusion encodes a constitutively active tyrosine kinase, driving unregulated cell proliferation.
  2. Chromosomal Inversions: A segment of DNA reverses orientation within the same chromosome, bringing two previously unrelated genes together.

    For instance, EML4-ALK fusion in non-small cell lung cancer (NSCLC) occurs due to an inversion on chromosome 2, leading to aberrant ALK signaling.
  3. Deletions and Insertions: Loss or gain of DNA fragments can create novel fusion genes.

    For instance, TMPRSS2-ERG fusion in prostate cancer results from a deletion event, placing ERG under the control of the androgen-responsive TMPRSS2 promoter.
  4. Non-Homologous End Joining (NHEJ) and Microhomology-Mediated Break-Induced Replication (MMBIR): NHEJ is a DNA repair mechanism that joins DNA double-strand breaks with minimal sequence homology, often leading to random fusion events. MMBIR is a replication-based mechanism that generates fusion genes by aligning short homologous sequences at breakpoints.

    For instance, FGFR3-TACC3 fusions in glioblastoma are believed to form through MMBIR-mediated rearrangements.
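The first three mechanisms can be told apart, to a first approximation, from the chromosome and strand of the two partner genes. The sketch below is a simplified heuristic under stated assumptions, not how any particular caller classifies events: `GeneLocus` and `classify_rearrangement` are hypothetical names, and the loci listed are included only to illustrate the examples given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneLocus:
    """Minimal description of a partner gene (illustrative only)."""
    name: str
    chrom: str
    strand: str  # "+" or "-"


def classify_rearrangement(five_prime: GeneLocus, three_prime: GeneLocus) -> str:
    """Guess the rearrangement type that could join two genes in-sense.

    Heuristic: partners on different chromosomes imply a translocation;
    partners on the same chromosome but opposite strands imply an inversion;
    partners on the same chromosome and strand imply a deletion or duplication
    (this sketch does not try to distinguish the two).
    """
    if five_prime.chrom != three_prime.chrom:
        return "translocation"
    if five_prime.strand != three_prime.strand:
        return "inversion"
    return "deletion_or_duplication"


# Loci for the examples in the text (strand values are illustrative inputs).
bcr, abl1 = GeneLocus("BCR", "chr22", "+"), GeneLocus("ABL1", "chr9", "+")
eml4, alk = GeneLocus("EML4", "chr2", "+"), GeneLocus("ALK", "chr2", "-")
tmprss2, erg = GeneLocus("TMPRSS2", "chr21", "-"), GeneLocus("ERG", "chr21", "-")

for five, three in [(bcr, abl1), (eml4, alk), (tmprss2, erg)]:
    print(f"{five.name}-{three.name}: {classify_rearrangement(five, three)}")
```

Repair-pathway mechanisms such as NHEJ and MMBIR cannot be inferred from gene coordinates alone; they are usually studied from breakpoint-level sequence, which this sketch does not model.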

Fusion genes frequently activate oncogenic pathways, disrupt tumor suppressor functions, and drive resistance to therapy, making their detection essential for precision oncology.

Detection of Fusion Genes using Targeted RNA-Sequencing

Fusion genes, which result from chromosomal rearrangements such as translocations, inversions, or deletions, play a crucial role in cancer pathogenesis and therapeutic decision-making. 

Targeted RNA-sequencing (RNA-seq) has emerged as a powerful method for detecting these fusion transcripts with high specificity and sensitivity, particularly in oncology. 

This process involves rigorous sample preparation, sequencing, bioinformatics analysis, and clinical interpretation to ensure accurate identification and clinical relevance.

1. Sample Preparation

The success of fusion gene detection via targeted RNA-seq begins with meticulous sample processing to preserve RNA integrity and enhance detection sensitivity. RNA degradation, low sample input, and contamination with normal RNA are key challenges that must be mitigated through optimized protocols.

RNA Extraction

Total RNA is extracted from various biological specimens, each with different challenges:

  • Fresh-frozen tissue: Provides high-quality RNA with minimal degradation, making it the preferred choice for transcriptomic studies.
  • Formalin-fixed paraffin-embedded (FFPE) tissue: Frequently used in clinical pathology but often contains degraded RNA, requiring specialized extraction protocols to recover fragmented transcripts.
  • Blood or bone marrow aspirates: Used for detecting hematological malignancies, where high levels of normal RNA can dilute fusion transcripts, necessitating enrichment strategies.
  • Cell cultures: Provide controlled experimental conditions for studying fusion genes in vitro, ensuring sufficient RNA yield for downstream applications.

RNA integrity is crucial for reliable fusion detection, with an RNA Integrity Number (RIN) of ≥7 being optimal. However, FFPE-derived RNA often falls below this threshold, requiring targeted sequencing strategies capable of working with degraded RNA.

Biostate AI makes RNA sequencing accessible at unmatched scale and cost. We offer Total RNA-Seq services for all sample types—FFPE tissue, blood, and cell cultures. The platform covers everything: RNA extraction, library prep, sequencing, and data analysis, providing comprehensive insights for longitudinal studies, multi-organ impact, and individual differences.

Library Preparation and Target Enrichment

Once RNA is extracted, it undergoes reverse transcription to complementary DNA (cDNA), which is subsequently processed for sequencing. An essential step in this process is target enrichment, which enhances the detection of fusion transcripts by selectively isolating relevant RNA molecules before sequencing. 

Two primary enrichment strategies play a crucial role in this phase:

(A) Hybrid Capture-Based Enrichment

This method optimizes library preparation by ensuring that cDNA libraries contain enriched fusion transcripts, allowing for the broad interrogation of fusion events across multiple gene panels. 

It plays a critical role in maintaining library complexity, even when working with fragmented RNA, such as formalin-fixed, paraffin-embedded (FFPE) samples. By efficiently isolating low-abundance fusion transcripts, hybrid capture enhances the sensitivity and specificity of targeted RNA sequencing.

(B) Anchored Multiplex PCR (AMP)

AMP contributes to target enrichment by enabling precise amplification of fusion transcripts without requiring prior knowledge of fusion partners. This method ensures that fusion-containing sequences are preferentially amplified, increasing their representation in sequencing libraries while minimizing background noise. 

By generating highly focused sequencing libraries, AMP reduces sequencing costs and computational demands while maximizing the efficiency of fusion gene detection in complex tumor samples.

2. Sequencing and Bioinformatics Analysis

Once the targeted cDNA library is enriched, next-generation sequencing (NGS) is performed to generate high-throughput sequencing data. Platforms such as Illumina NovaSeq, HiSeq, and MiSeq are commonly used due to their high per-base accuracy, paired-end short reads, and capacity to process multiple samples simultaneously.

NGS generates short paired-end reads, which cover fusion transcripts by sequencing both ends of each cDNA fragment. The resulting sequencing data undergoes bioinformatics processing, which involves several computational steps to ensure accurate fusion gene detection, junction identification, and false-positive filtering.

Key Bioinformatics Pipelines for Fusion Gene Detection

Several high-precision fusion detection algorithms are used to analyze sequencing data, ensuring the accurate identification of fusion junctions while minimizing false positives:

(A) STAR-Fusion

STAR-Fusion, built on the Spliced Transcripts Alignment to a Reference (STAR) aligner, detects fusion genes by:

  • Aligning chimeric reads to the reference genome and identifying breakpoints where two distinct gene sequences merge.
  • Utilizing split-read analysis, which maps partial reads to separate genomic locations, confirming the presence of a fusion junction.
  • Detecting discordant paired-end reads, where one read aligns to one gene, and the paired read aligns to a different gene, further verifying the fusion event.

It generates a fusion transcript report, listing detected fusions along with supporting read counts, fusion breakpoints, and confidence scores.
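The two kinds of evidence described above, split reads and discordant read pairs, can be illustrated with a toy counter. This is a minimal sketch rather than the tool's actual implementation: it assumes each mate has already been reduced to the list of genes its aligned segments hit, and the names `classify_pair` and `count_support` are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

# A read pair is represented as (genes hit by mate 1, genes hit by mate 2).
ReadPair = Tuple[List[str], List[str]]


def classify_pair(pair: ReadPair, gene_a: str, gene_b: str) -> str:
    """Label one read pair as evidence for a gene_a/gene_b fusion."""
    mate1, mate2 = pair
    # Split read: a single mate has segments in both genes, so it crosses the junction.
    for mate in (mate1, mate2):
        if gene_a in mate and gene_b in mate:
            return "split_read"
    # Discordant pair: each mate maps entirely to a different partner gene.
    genes1, genes2 = set(mate1), set(mate2)
    if (genes1 == {gene_a} and genes2 == {gene_b}) or (
        genes1 == {gene_b} and genes2 == {gene_a}
    ):
        return "discordant_pair"
    return "no_support"


def count_support(pairs: List[ReadPair], gene_a: str, gene_b: str) -> Counter:
    """Tally evidence types across all read pairs."""
    return Counter(classify_pair(p, gene_a, gene_b) for p in pairs)


pairs = [
    (["EML4", "ALK"], ["ALK"]),  # mate 1 crosses the junction
    (["EML4"], ["ALK"]),         # mates land in different genes
    (["EML4"], ["EML4"]),        # ordinary pair within one gene
]
support = count_support(pairs, "EML4", "ALK")
print(dict(support))
```

Reporting the two counts separately mirrors the supporting read counts listed in a fusion report, since junction-crossing reads pinpoint the breakpoint while discordant pairs only bracket it.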

(B) FusionCatcher

FusionCatcher is optimized to detect fusion genes in heterogeneous samples with mixed RNA populations. It processes sequencing data through:

  • Mapping reads to multiple reference databases, including known fusion gene databases, normal transcriptomes, and pseudogenes to differentiate real fusions from sequencing artifacts.
  • Applying split-read and paired-end read analysis to confirm fusion events at high resolution.
  • Filtering out false positives by comparing findings against known sequencing errors, transcriptome variations, and read-through events.

This method is particularly well suited to samples with a high background of normal RNA, supporting reliable detection even in complex datasets.

(C) Arriba

Arriba, which works from alignments produced by the STAR aligner, enhances fusion detection through:

  • Chimeric read alignment and split-read analysis, similar to STAR-Fusion, to ensure high sensitivity.
  • Blacklist filtering, which removes recurrent false-positive fusions arising from misalignment or sequencing errors.
  • Oncogenic event prioritization, where detected fusions are ranked based on known oncogenic drivers, helping to distinguish biologically relevant fusions from non-pathogenic ones.
  • Read-through event correction, which ensures that gene fusions are not misclassified due to naturally occurring transcriptional overlaps.

The final output includes detailed fusion annotations, with confidence scores and transcript-level details for each detected event.
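The filtering and prioritization ideas above (blacklists, read-through removal, minimum read support, driver-first ranking) can be sketched in a few lines. This is an illustration under assumptions, not the logic of any of the tools named here: the `Candidate` fields, the thresholds, and the placeholder gene names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Candidate:
    gene5: str
    gene3: str
    supporting_reads: int
    # Distance between the genes when they sit on the same chromosome and strand;
    # None when they do not (so read-through is not possible).
    same_strand_distance_bp: Optional[int] = None


def rejection_reason(
    c: Candidate,
    blacklist: Set[Tuple[str, str]],
    min_reads: int = 3,                # assumed threshold
    readthrough_max_bp: int = 100_000,  # assumed threshold
) -> Optional[str]:
    """Return why a candidate is filtered out, or None if it passes."""
    if (c.gene5, c.gene3) in blacklist:
        return "blacklisted"
    d = c.same_strand_distance_bp
    if d is not None and d <= readthrough_max_bp:
        return "likely_read_through"
    if c.supporting_reads < min_reads:
        return "low_support"
    return None


def filter_and_rank(
    candidates: List[Candidate],
    blacklist: Set[Tuple[str, str]],
    known_drivers: Set[str],
) -> Tuple[List[Candidate], Dict[str, str]]:
    """Drop filtered candidates, then rank known-driver fusions first."""
    kept: List[Candidate] = []
    rejected: Dict[str, str] = {}
    for c in candidates:
        reason = rejection_reason(c, blacklist)
        if reason is None:
            kept.append(c)
        else:
            rejected[f"{c.gene5}--{c.gene3}"] = reason

    def sort_key(c: Candidate) -> Tuple[bool, int]:
        is_driver = c.gene5 in known_drivers or c.gene3 in known_drivers
        return (not is_driver, -c.supporting_reads)

    return sorted(kept, key=sort_key), rejected


candidates = [
    Candidate("GENE_G", "GENE_H", 30),
    Candidate("EML4", "ALK", 25),
    Candidate("GENE_A", "GENE_B", 40),
    Candidate("GENE_C", "GENE_D", 12, same_strand_distance_bp=5_000),
    Candidate("GENE_E", "GENE_F", 1),
]
kept, rejected = filter_and_rank(
    candidates, blacklist={("GENE_A", "GENE_B")}, known_drivers={"ALK"}
)
print([f"{c.gene5}--{c.gene3}" for c in kept])
print(rejected)
```

Recording a reason for every rejected candidate, rather than silently dropping it, makes it easier to audit why an expected fusion is missing from the final report.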

Biostate AI streamlines this process with affordable end-to-end RNA sequencing services, handling diverse sample types while delivering high-quality data—allowing researchers to focus on study design and meaningful biological insights.

3. Interpretation of Fusion Gene Results

The final step in fusion gene detection is the interpretation of sequencing results to determine oncogenic relevance and therapeutic implications:

Clinically Actionable Fusions

These fusion genes are directly targetable using precision therapies, making them critical for treatment selection in oncology.

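  • ALK, ROS1, and NTRK1/2/3 fusions: Responsive to tyrosine kinase inhibitors (TKIs), such as crizotinib for ALK and ROS1 fusions and entrectinib for ROS1 and NTRK fusions, in lung cancer and, for NTRK fusions, other solid tumors such as colorectal cancer.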
  • RET fusions: Targeted by RET inhibitors (e.g., selpercatinib, pralsetinib) in lung and thyroid cancers.
  • FGFR fusions: Sensitive to FGFR inhibitors (e.g., erdafitinib, pemigatinib) in bladder and biliary tract cancers.

The use of targeted RNA sequencing in non-small cell lung cancer (NSCLC) has significantly improved the detection of ALK fusions in clinical settings. A study reported that ALK-positive NSCLC patients treated with ALK inhibitors such as crizotinib showed a reduction in tumor progression compared to chemotherapy-treated patients. 

RNA sequencing provided high accuracy in detecting ALK rearrangements, leading to better therapy selection and improved patient outcomes.

Prognostic and Diagnostic Value

Some fusion genes serve as diagnostic biomarkers or prognostic indicators, providing critical information about disease progression.

  • BCR-ABL1 (Chronic Myeloid Leukemia – CML): Predicts poor prognosis without TKI therapy, with imatinib significantly improving survival outcomes.
  • EWSR1-FLI1 (Ewing Sarcoma): Diagnostic marker for Ewing sarcoma, distinguishing it from other soft tissue tumors.
  • SS18-SSX (Synovial Sarcoma): Confirms the diagnosis of synovial sarcoma, a rare but aggressive soft tissue cancer.

Overcoming Challenges in Detection of Fusion Genes Using Targeted RNA-Sequencing

Detecting fusion genes using targeted RNA sequencing presents several challenges, particularly concerning false positives, sample quality, and sequencing depth. Below is a more detailed discussion of these issues and their solutions.

1. False Positives & Data Interpretation

One of the major challenges in high-throughput sequencing is the increased likelihood of false-positive fusion calls. These erroneous calls can arise due to sequencing artifacts, misaligned reads, or random transcriptomic rearrangements that do not represent true gene fusions.

To improve the accuracy of fusion gene detection, using multiple bioinformatics tools in parallel can enhance verification. STAR-Fusion and FusionCatcher are two widely used algorithms designed to detect fusion transcripts with high confidence.

  • STAR-Fusion relies on high-quality alignment and annotation-based filtering to reduce false positives.
  • FusionCatcher employs a more exhaustive approach, including known fusion databases and negative controls, to filter out potential artifacts.
  • Parallel analysis using both tools allows for cross-validation, ensuring that only high-confidence fusion events are reported.

By integrating these tools and applying stringent filtering criteria, researchers can significantly improve the reliability of fusion gene detection while minimizing false-positive rates.
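Cross-validation between callers reduces to a set operation once each tool's calls are normalized to the same representation. The following is a minimal sketch that assumes the calls have already been parsed into (5' gene, 3' gene) tuples; it does not read either tool's real output format, and `consensus_calls` is a hypothetical helper.

```python
from typing import Dict, List, Set, Tuple

Fusion = Tuple[str, str]  # (5' gene, 3' gene)


def consensus_calls(
    calls_by_tool: Dict[str, Set[Fusion]], min_tools: int = 2
) -> Dict[Fusion, List[str]]:
    """Keep fusions reported by at least `min_tools` callers.

    Returns a mapping from each retained fusion to the sorted list of
    tools that reported it.
    """
    support: Dict[Fusion, List[str]] = {}
    for tool, calls in calls_by_tool.items():
        for fusion in calls:
            support.setdefault(fusion, []).append(tool)
    return {
        fusion: sorted(tools)
        for fusion, tools in support.items()
        if len(tools) >= min_tools
    }


# Hypothetical parsed calls from two callers.
calls = {
    "STAR-Fusion": {("EML4", "ALK"), ("GENE_A", "GENE_B")},
    "FusionCatcher": {("EML4", "ALK"), ("GENE_C", "GENE_D")},
}
high_confidence = consensus_calls(calls)
print(high_confidence)
```

Requiring agreement trades some sensitivity for specificity, so calls seen by only one tool are usually better flagged for manual review than discarded outright.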

Biostate AI offers streamlined, cost-effective RNA-Seq workflows, from sample processing to advanced data analysis, that help researchers plan studies (including sample size estimates for RNA sequencing data) and obtain high-precision transcriptomic insights.

2. Low-Quality FFPE Samples

Formalin-fixed, paraffin-embedded (FFPE) samples are commonly used in clinical and research settings, but the RNA extracted from these specimens is often degraded and fragmented. This degradation poses significant challenges for amplification-based techniques such as AMP, as incomplete or damaged RNA may lead to amplification biases or failure to detect certain fusion events.

Hybrid capture-based sequencing is often a preferred alternative to AMP for heavily degraded FFPE samples due to its ability to tolerate degraded RNA:

  • Hybrid capture enrichment uses complementary probes to selectively pull down target sequences, capturing fragmented RNA more effectively than AMP-based methods.
  • Unlike PCR amplification, which requires intact primer binding regions, hybrid capture is less affected by RNA fragmentation and degradation.
  • This approach improves the ability to detect fusion transcripts in challenging samples, making it a valuable method for retrospective and clinical studies using archived FFPE tissues.

3. Sequence Coverage & Detection Limits

Detecting fusion genes with low expression levels can be difficult, particularly when sequencing depth is insufficient. Lowly expressed fusion transcripts may be missed due to their low abundance relative to highly expressed background genes.

Optimizing sequencing depth is crucial for improving detection sensitivity:

  • Increasing read depth (~50 million reads per sample) enhances the ability to detect rare fusion transcripts.
  • Deeper sequencing provides better coverage of fusion junctions, reducing the risk of missing lowly expressed fusion events.
  • In cases where ultra-low expression fusions are of interest, deeper sequencing (e.g., 100 million reads/sample) may be necessary, depending on the complexity of the transcriptome.

Balancing sequencing depth with cost and computational efficiency is essential to achieving reliable fusion detection while maintaining feasibility for large-scale studies.
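The effect of depth on sensitivity can be made concrete with a simple model. The sketch below assumes that the number of junction-supporting reads follows a Poisson distribution whose mean is the total read count multiplied by the fraction of reads that cover the junction; that fraction (1e-7 here) and the three-read calling threshold are hypothetical values chosen only for illustration.

```python
import math


def detection_probability(
    total_reads: int, junction_fraction: float, min_reads: int = 3
) -> float:
    """Probability of observing at least `min_reads` junction-supporting reads.

    Assumes junction read counts are Poisson distributed with mean
    total_reads * junction_fraction.
    """
    lam = total_reads * junction_fraction
    p_fewer = sum(
        math.exp(-lam) * lam**i / math.factorial(i) for i in range(min_reads)
    )
    return 1.0 - p_fewer


for depth in (10_000_000, 50_000_000, 100_000_000):
    p = detection_probability(depth, junction_fraction=1e-7)
    print(f"{depth:>11,} reads -> P(detect) = {p:.3f}")
```

Under these assumptions, the detection probability is roughly 0.88 at 50 million reads and above 0.99 at 100 million reads, which shows why rare transcripts benefit from deeper sequencing. Target enrichment raises the junction fraction itself, so the same sensitivity can be reached with far fewer total reads.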

Conclusion

Accurately detecting fusion genes using targeted RNA sequencing is essential for precision oncology and molecular diagnostics. Methods such as hybrid capture enrichment, anchored multiplex PCR, and advanced bioinformatics tools enable precise fusion detection while minimizing false positives. 

Despite technological advancements, factors like RNA integrity, sequencing depth, and computational refinement remain critical for reliable results.

With Biostate AI, researchers can streamline fusion gene detection through automated, high-precision RNA-seq workflows, from sample preparation to bioinformatics analysis, supporting high-confidence fusion detection with targeted RNA sequencing for clinical and research applications.

Disclaimer


The information present in this article is provided only for informational purposes and should not be interpreted as medical advice. Treatment strategies, including those related to gene expression and regulatory mechanisms, should only be pursued under the guidance of a qualified healthcare professional. 

Always consult a healthcare provider or genetic counselor before making decisions about your research or any treatments based on gene expression analysis.

Frequently Asked Questions

1. Can NGS detect gene fusions? 

Yes, next-generation sequencing (NGS) can detect gene fusions by analyzing RNA or DNA. It identifies both known and novel fusion events, helping in precise cancer diagnosis and enabling personalized treatment strategies.

2. What is an RNA fusion panel? 

An RNA fusion panel is a diagnostic tool used to identify gene fusions in RNA samples. It targets specific genes associated with cancer, providing detailed insights into fusion events using targeted RNA sequencing, enhancing molecular diagnostics.

3. What is the most preferable testing option for detecting NTRK fusions? 

RNA sequencing with targeted fusion panels is widely regarded as one of the most reliable methods for detecting NTRK fusions. This approach offers high sensitivity, can detect fusions involving NTRK1, NTRK2, and NTRK3 without prior knowledge of the partner gene (depending on panel design), and supports personalized treatment plans in cancers with NTRK gene alterations.
