1. What CLASH data is currently available in CLASHub?
CLASHub hosts data from four species: Human, Mouse, Drosophila melanogaster, and Caenorhabditis elegans. Below is the summary of available datasets:
Human
| Sample Name | Wild Type (#) | Non-targeting sgRNA Control (#) | ZSWIM8 Knockout (#) | BioProject Number | SRR Number |
|---|---|---|---|---|---|
| A549 | — | 6 | 6 | PRJNA1166120 | SRR34738798, SRR34738799, SRR34738800, SRR34738801, SRR34738802, SRR34738803, SRR34738804, SRR34738805, SRR34738790, SRR34738791, SRR34738792, SRR34738793 |
| Colorectal tissue | 2 | — | — | PRJNA1166120 | SRR37216684, SRR37216685 |
| D425 | 3 | — | — | PRJNA1166120 | SRR34757946, SRR34757949, SRR34757950 |
| ES2 | — | 3 | 3 | PRJNA1166120 | SRR34757940, SRR34757941, SRR34757942, SRR34757943, SRR34757944, SRR34757945 |
| HCT116 | 5 | — | 3 | GSE164634, PRJNA1166120 | SRR13415087, SRR13415088, SRR13415089, SRR13415090, SRR13415091, SRR34757939, SRR34757947, SRR34757948 |
| HEK293T | 8 | — | — | GSE198250, PRJNA1166120 | SRR18281055, SRR18281057, SRR18281067, SRR18281068, SRR34761041, SRR34761042, SRR34761043, SRR34761044 |
| HepG2 | 3 | — | — | PRJNA1166120 | SRR34783077, SRR34783079, SRR34783080 |
| H1299 | — | 3 | 3 | PRJNA1166120 | SRR34768260, SRR34768261, SRR34768262, SRR34768263, SRR34768274, SRR34768275 |
| MB002 | — | 4 | 4 | PRJNA1166120 | SRR34783070, SRR34783071, SRR34783072, SRR34783073, SRR34783074, SRR34783075, SRR34783076, SRR34783078 |
| MDA-MB-231 | — | 6 | 6 | PRJNA1166120 | SRR30817646, SRR30817647, SRR30817648, SRR30817649, SRR30817650, SRR30817651, SRR34738794, SRR34738795, SRR34738796, SRR34738797, SRR34738806, SRR34738807 |
| OVCAR8 | — | 3 | 3 | PRJNA1166120 | SRR34768264, SRR34768265, SRR34768266, SRR34768267, SRR34768276, SRR34768277 |
| TIVE-EX-LTC | 3 | — | — | GSE101978 | SRR5876947, SRR5876948, SRR5876949 |
| T98G | — | 3 | 3 | PRJNA1166120 | SRR34743309, SRR34743310, SRR34743311, SRR34743312, SRR34743317, SRR34743318 |
| U87MG | — | 3 | 3 | PRJNA1166120 | SRR34743313, SRR34743314, SRR34743315, SRR34743316, SRR34743319, SRR34743320 |
| 501Mel | — | 3 | 3 | PRJNA1166120 | SRR34768268, SRR34768269, SRR34768270, SRR34768271, SRR34768272, SRR34768273 |
Mouse
| Sample Name | Wild Type (#) | Non-targeting sgRNA Control (#) | Zswim8 Knockout (#) | BioProject Number | SRR Number |
|---|---|---|---|---|---|
| HE2.1B | 6 | — | — | GSE124687 | SRR8395242, SRR8395243, SRR8395244, SRR8395245, SRR8395246, SRR8395247 |
| MEF | — | 2 | 2 | PRJNA1166120 | SRR34793109, SRR34793110, SRR34793111, SRR34793112 |
| Striatal cell | — | 4 | 4 | PRJNA1093144 | SRR28497185, SRR28497186, SRR28497189, SRR28497190, SRR28497197, SRR28497198, SRR28497199, SRR28497200 |
| 3T12 | 3 | — | — | GSE124687 | SRR8395248, SRR8395249, SRR8395250 |
| Cortex tissue | 8 | — | — | GSE73058 | SRR2413277, SRR2413278, SRR2413282, SRR2413289, SRR2413290, SRR2413300, SRR2413301, SRR2413302 |
| Heart tissue | 2 | — | — | PRJNA1166120 | SRR34793107, SRR34793108 |
| Kidney tissue | 2 | — | — | PRJNA1166120 | SRR34793105, SRR34793106 |
Drosophila melanogaster
| Sample Name | Wild Type (#) | Non-targeting sgRNA Control (#) | Dora Knockout (#) | BioProject Number | SRR Number |
|---|---|---|---|---|---|
| S2 cells | — | 3 | 3 | PRJNA896239 | SRR22129325, SRR22129327, SRR22129328, SRR22129284, SRR22129287, SRR22129298 |
Caenorhabditis elegans
| Sample Name | Wild Type (#) | Non-targeting sgRNA Control (#) | Ebax Knockout (#) | BioProject Number | SRR Number |
|---|---|---|---|---|---|
| Embryo | — | 4 | 4 | GSE303817 | — |
| mid-L4 stage | — | 4 | — | PRJNA328816 | SRR3882724, SRR3882949, SRR3882950, SRR3882951 |
2. What Gene Expression Profile data is available in CLASHub?
CLASHub hosts Gene Expression Profile data from four species: Human, Mouse, Drosophila melanogaster, and Caenorhabditis elegans. Below is the summary of available datasets:
Human
| Sample Name | Wild Type (#) | Non-targeting sgRNA Control (#) | ZSWIM8 Knockout (#) | BioProject Number | SRR Number |
|---|---|---|---|---|---|
| A549 | 7 | — | — | GSE263036, GSE212057, GSE199309 | SRR28535493, SRR28535494, SRR28535495, SRR21237863, SRR21237869, SRR21237879, SRR18462418 |
| D425 | 5 | — | — | GSE151810, GSE185024, GSE123760 | SRR11924485, SRR11924486, SRR16119415, SRR16119416, SRR8315029 |
| ES2 | 6 | — | — | GSE218794, GSE245778 | SRR22410790, SRR22410791, SRR22410792, SRR26439462, SRR26439463, SRR26439464 |
| HEK293T | 7 | — | — | GSE231583, GSE196043 | SRR24421974, SRR24421975, SRR24421976, SRR18074813, SRR18074814, SRR18074815, SRR18074816 |
| HeLa | 7 | — | — | GSE273634, GSE218727, GSE199309 | SRR30058518, SRR30058519, SRR30058520, SRR22407570, SRR22407571, SRR22407572, SRR18462415 |
| HepG2 | 5 | — | — | GSE224980, GSE264010 | SRR28685775, SRR28685776, SRR28685777, SRR23387178, SRR23387179 |
| H1299 | 4 | — | — | GSE212057, GSE199309 | SRR21237865, SRR21237873, SRR21237881, SRR18462412 |
| K562 | 6 | — | — | GSE199309, GSE167869 | SRR18462409, SRR13800753, SRR13800754, SRR13800737, SRR13800738, SRR13800739 |
| MB002 | 5 | — | — | GSE229150, GSE261568 | SRR28341540, SRR28341541, SRR28341542, SRR28341543 |
| MCF7 | 7 | — | — | GSE195761, GSE178905, GSE163791 | SRR17944548, SRR17944549, SRR14915857, SRR14915858, SRR13296901, SRR13296902, SRR13296903 |
| MDA-MB-231 | 6 | — | — | GSE178532 | SRR11544576, SRR11544577, SRR11544578, SRR14870088, SRR14870089, SRR14870090 |
| OVCAR8 | 4 | — | — | GSE246325 | SRR26536798, SRR26536799, SRR26536802, SRR26536803 |
| T98G | 5 | — | — | GSE112241, PRJNA580150 | SRR10358029, SRR10358030, SRR10358031, SRR6881782, SRR6881783 |
| U87MG | 6 | — | — | GSE147626, GSE235568 | SRR11433766, SRR11433767, SRR11433768, SRR24991947, SRR24991948, SRR24991949 |
| 501Mel | 7 | — | — | PRJNA515302, GSE104869 | SRR8473015, SRR8473019, SRR8473020, SRR6163777, SRR6163778, SRR6163779, SRR6163780 |
Mouse
| Sample Name | Wild Type (#) | Non-targeting sgRNA Control (#) | Zswim8 Knockout (#) | BioProject Number | SRR Number |
|---|---|---|---|---|---|
| Eye | — | 3 | 3 | GSE231447 | SRR24391488, SRR24391489, SRR24391526, SRR24391480, SRR24391481, SRR24391536 |
| Forebrain | — | 3 | 3 | GSE231447 | SRR24391522, SRR24391523, SRR24391534, SRR24391514, SRR24391515, SRR24391547 |
| Heart | — | 3 | 3 | GSE231447 | SRR24391502, SRR24391503, SRR24391533, SRR24391510, SRR24391511, SRR24391543 |
| Hindbrain | — | 3 | 3 | GSE231447 | SRR24391520, SRR24391521, SRR24391538, SRR24391512, SRR24391513, SRR24391546 |
| Intestine | — | 3 | 3 | GSE231447 | SRR24391494, SRR24391495, SRR24391530, SRR24391486, SRR24391487, SRR24391545 |
| Kidney | — | 3 | 3 | GSE231447 | SRR24391490, SRR24391491, SRR24391531, SRR24391482, SRR24391483, SRR24391539 |
| Liver | — | 3 | 3 | GSE231447 | SRR24391492, SRR24391493, SRR24391527, SRR24391484, SRR24391485, SRR24391540 |
| Lung | — | 3 | 3 | GSE231447 | SRR24391500, SRR24391501, SRR24391532, SRR24391508, SRR24391509, SRR24391542 |
| Muscle | — | 3 | 3 | GSE231447 | SRR24391518, SRR24391519, SRR24391525, SRR24391478, SRR24391479, SRR24391535 |
| Placenta | — | 3 | 3 | GSE231447 | SRR24391516, SRR24391517, SRR24391524, SRR24391476, SRR24391477, SRR24391537 |
| Skin | — | 3 | 3 | GSE231447 | SRR24391496, SRR24391497, SRR24391528, SRR24391504, SRR24391505, SRR24391541 |
| Stomach | — | 3 | 3 | GSE231447 | SRR24391498, SRR24391499, SRR24391529, SRR24391506, SRR24391507, SRR24391544 |
| Embryonic Stem Cell | 2 | — | — | PRJEB27315 | ERR2640636, ERR2640637 |
| iNeuron | 3 | — | — | PRJEB27315 | ERR2640652, ERR2640653, ERR2640654 |
| MEF | 3 | — | — | GSE239373 | SRR25443485, SRR25443484, SRR25443483 |
| Neural Precursor | 2 | — | — | PRJEB27315 | ERR2640640, ERR2640641 |
| Striatal cell | — | 4 | 4 | PRJNA1093144 | SRR34804890, SRR34804891, SRR34804892, SRR34804893, SRR34804894, SRR34804895, SRR34804896, SRR34804897 |
Drosophila melanogaster
| Sample Name | Wild Type (#) | Non-targeting sgRNA Control (#) | Dora Knockout (#) | BioProject Number | SRR Number |
|---|---|---|---|---|---|
| S2 cells | 5 | — | 3 | GSE196837 | SRR18048483, SRR18048484, SRR18048425, SRR18048423, SRR18048424, SRR18048427, SRR18048468, SRR18048426 |
| 0–4 h Embryos | 4 | — | — | GSE196837 | SRR18048437, SRR18048436, SRR18048435, SRR18048446 |
| 8–12 h Embryos | 6 | — | 4 | GSE196837 | SRR18048461, SRR18048433, SRR18048512, SRR18048481, SRR18048482, SRR18048434, SRR18048499, SRR18048531, SRR18048442, SRR18048532 |
| 12–16 h Embryos | 6 | — | 4 | GSE196837 | SRR18048539, SRR18048525, SRR18048508, SRR18048459, SRR18048432, SRR18048465, SRR18048448, SRR18048497, SRR18048529, SRR18048516 |
| 16–20 h Embryos | 5 | — | 4 | GSE196837 | SRR18048421, SRR18048538, SRR18048479, SRR18048463, SRR18048527, SRR18048542, SRR18048443, SRR18048495, SRR18048501 |
| Fly Non-targeting Control | — | 3 | — | PRJNA896239 | SRR22129292, SRR22129294, SRR22129296 |
Caenorhabditis elegans
| Sample Name | Wild Type (#) | Non-targeting sgRNA Control (#) | Ebax Knockout (#) | BioProject Number | SRR Number |
|---|---|---|---|---|---|
| Embryos | 4 | — | — | PRJNA922944 | SRR23049957, SRR23049959, SRR23049928, SRR23049954 |
| L1 | 5 | — | 2 | GSE68588, GSE262626, GSE267368 | SRR2010468, SRR2010469, SRR28479534, SRR29013568, SRR29013569, SRR29013570, SRR29013571 |
| L2 | 3 | — | — | GSE266398 | SRR28868053, SRR28868054, SRR28868055 |
| L3 | 3 | — | — | PRJNA684142 | SRR13238604, SRR13238605, SRR13238606 |
| L4 | 3 | — | — | PRJNA922944 | SRR23049963, SRR23049955, SRR23049961 |
| Adult | 4 | — | — | PRJNA922944, GSE267368 | SRR23049965, SRR23049966, SRR23049906, SRR23049937 |
3. What miRNA Expression Profile data is available in CLASHub?
CLASHub hosts miRNA Expression Profile data from four species: Human, Mouse, Drosophila melanogaster, and Caenorhabditis elegans. Below is the summary of available datasets:
Human
| Sample Name | Wild Type (#) | Non-targeting sgRNA Control (#) | ZSWIM8 Knockout (#) | BioProject Number | SRR Number |
|---|---|---|---|---|---|
| A549 | — | 3 | 3 | GSE163387 | SRR13264637, SRR13264638, SRR13264639, SRR13264640, SRR13264641, SRR13264642 |
| HEK293T | — | 3 | 3 | GSE123627, GSE158025 | SRR12650650, SRR12650651, SRR12650652, SRR12650653, SRR12650654, SRR12650655 |
| HeLa | — | 3 | 3 | GSE123627, GSE163387 | SRR13264643, SRR13264644, SRR13264645, SRR13264646, SRR13264647, SRR13264648 |
| K562 | 6 | — | 6 | GSE158025, GSE163388 | SRR12650656, SRR12650657, SRR12650658, SRR13264707, SRR13264708, SRR13264709, SRR12650659, SRR12650660, SRR12650661, SRR13264710, SRR13264711, SRR13264712 |
| MCF7 | — | 2 | 3 | GSE163388 | SRR13264649, SRR13264650, SRR13264651, SRR13264652, SRR13264653 |
Mouse
| Sample Name | Wild Type (#) | Non-targeting sgRNA Control (#) | Zswim8 Knockout (#) | BioProject Number | SRR Number |
|---|---|---|---|---|---|
| Brain | 3 | — | 3 | GSE235065 | SRR24941005, SRR24941026, SRR24940996, SRR24941021, SRR24941036, SRR24941000 |
| Heart | 3 | — | 3 | GSE235065 | SRR24941003, SRR24941027, SRR24940995, SRR24941022, SRR24941035, SRR24940999 |
| Kidney | 3 | — | 3 | GSE235065 | SRR24941001, SRR24940993, SRR24941029, SRR24941011, SRR24941033, SRR24941017 |
| Liver | 3 | — | 3 | GSE235065 | SRR24941004, SRR24940989, SRR24941030, SRR24941010, SRR24941032, SRR24941016 |
| Lung | 3 | — | 3 | GSE235065 | SRR24940992, SRR24940998, SRR24941018, SRR24941008, SRR24941031, SRR24941015 |
| Intestine | 3 | — | 3 | GSE235065 | SRR24941002, SRR24940994, SRR24941028, SRR24941023, SRR24941012, SRR24941034 |
| Neuron | — | 3 | 2 | GSE163387 | SRR13264632, SRR13264633, SRR13264634, SRR13264635, SRR13264636 |
| MEF | — | 6 | 6 | GSE163387, GSE158025 | SRR13264626, SRR13264627, SRR13264628, SRR12650662, SRR12650663, SRR12650664, SRR13264629, SRR13264630, SRR13264631, SRR12650665, SRR12650666, SRR12650667 |
| Stomach | 3 | — | 3 | GSE235065 | SRR24941020, SRR24941009, SRR24940990, SRR24941006, SRR24941025, SRR24941013 |
| Skin | 3 | — | 3 | GSE235065 | SRR24941019, SRR24940991, SRR24940997, SRR24941024, SRR24941007, SRR24941014 |
| Striatal cell | — | 4 | 4 | PRJNA1093144 | SRR28497187, SRR28497188, SRR28497191, SRR28497192, SRR28497193, SRR28497194, SRR28497195, SRR28497196 |
Drosophila melanogaster
| Sample Name | Wild Type (#) | Non-targeting sgRNA Control (#) | Dora Knockout (#) | BioProject Number | SRR Number |
|---|---|---|---|---|---|
| S2 cells | 3 | — | 3 | GSE163388 | SRR13264713, SRR13264714, SRR13264715, SRR13264716, SRR13264717, SRR13264718 |
Caenorhabditis elegans
| Sample Name | Wild Type (#) | Non-targeting sgRNA Control (#) | Ebax Knockout (#) | BioProject Number | SRR Number |
|---|---|---|---|---|---|
| Early Embryo | 2 | — | 2 | GSE267367 | SRR29013903, SRR29013904, SRR29013905, SRR29013906 |
| Late Embryo | 2 | — | 2 | GSE267367 | SRR29013899, SRR29013900, SRR29013901, SRR29013902 |
| L1 | 4 | — | 4 | GSE267367 | SRR29013871, SRR29013872, SRR29013873, SRR29013874, SRR29013895, SRR29013896, SRR29013897, SRR29013898 |
| L2 | 2 | — | 2 | GSE267367 | SRR29013891, SRR29013892, SRR29013893, SRR29013894 |
| L3 | 2 | — | 2 | GSE267367 | SRR29013887, SRR29013888, SRR29013889, SRR29013890 |
| L4 | 5 | — | 4 | GSE267367 | SRR29013866, SRR29013867, SRR29013868, SRR29013869, SRR29013870, SRR29013883, SRR29013884, SRR29013885, SRR29013886 |
| Gravid adult | 2 | — | 2 | GSE267367 | SRR29013879, SRR29013880, SRR29013881, SRR29013882 |
| Glp-4 | 2 | — | 2 | GSE267367 | SRR29013875, SRR29013876, SRR29013877, SRR29013878 |
4. How is CLASH data analyzed in CLASHub?
Step 1: Data Upload and Input
CLASHub accepts paired-end FASTQ files or clean single-end FASTA files. Users need to provide minimal information to initiate the analysis.
1.1 Paired-end Adapter Sequences:
5′ Adapter Sequence (default): GATCGTCGGACTGTAGAACT
3′ Adapter Sequence (default): TGGAATTCTCGGGTGCCAAG
1.2 UMI Configuration: Users specify 5′ and 3′ Unique Molecular Identifier (UMI) lengths. Setting both to 0 automatically skips deduplication and UMI-trimming.
1.3 Target species: (e.g., Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans)
1.4 Output file names & Email address
Step 2: Data Preprocessing
CLASHub automatically processes the uploaded data. For paired-end FASTQ files, the preprocessing pipeline includes:
2.1 Adapter Trimming: Adapter sequences are removed using cutadapt (v2.10).
2.2 Read Merging: Overlapping paired-end reads are merged using PEAR (v0.9.6).
2.3 Redundancy Collapse & UMI Trimming: If UMIs are present, redundant reads are collapsed using fastx_collapser, and UMIs are trimmed. If UMIs are absent (lengths = 0), this step is bypassed.
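The collapse-then-trim logic of step 2.3 can be sketched in a few lines of Python. This is only an illustration, not CLASHub's own code: the function name `collapse_and_trim` is hypothetical, and the sketch assumes that reads are collapsed on their full sequence (UMIs included) before the UMIs are removed, which is what running fastx_collapser on untrimmed reads would do.

```python
from collections import Counter

def collapse_and_trim(reads, umi5=0, umi3=0):
    """Collapse identical reads, then strip UMIs from both ends.

    Hypothetical helper: returns a Counter mapping each UMI-trimmed
    sequence to the number of distinct (UMI + insert) molecules seen.
    """
    if umi5 == 0 and umi3 == 0:
        # No UMIs: skip deduplication and trimming, keep every read.
        return Counter(reads)
    unique_reads = set(reads)  # identical insert + UMIs = PCR duplicate
    trimmed = Counter()
    for read in unique_reads:
        insert = read[umi5:len(read) - umi3]
        if insert:  # drop reads no longer than the UMIs themselves
            trimmed[insert] += 1
    return trimmed

reads = [
    "AAAC" + "TGAGGTAGTAGGTTGTATAGTT" + "GGT",  # molecule 1
    "AAAC" + "TGAGGTAGTAGGTTGTATAGTT" + "GGT",  # PCR duplicate of molecule 1
    "CCGT" + "TGAGGTAGTAGGTTGTATAGTT" + "TAC",  # molecule 2, same insert
]
counts = collapse_and_trim(reads, umi5=4, umi3=3)
```

With a 4-nt 5′ UMI and a 3-nt 3′ UMI, the two distinct molecules are both counted while the PCR duplicate is not. With both lengths set to 0, the reads pass through unchanged.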
Step 3: Genome Mapping & Peak Calling
Cleaned sequences are aligned to the reference genome.
3.1 Downsampling: To prevent memory overload, files exceeding 20 million reads are downsampled prior to mapping.
3.2 Alignment: Reads are aligned using HISAT2, sorted with SAMtools, and converted to BED format.
3.3 Peak Calling: Piranha assesses target site confidence via peak-calling to identify high-confidence binding sites.
3.4 Visualization: BigWig (bw) files are automatically generated for direct inspection of read coverage in genome browsers like IGV.
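The description of step 3.1 does not say how reads are selected when a file exceeds the 20 million cap. One common way to draw a uniform random subset from a file of unknown size in a single pass is reservoir sampling, sketched below. The function name `downsample` and the fixed seed are assumptions for illustration and may differ from what CLASHub does.

```python
import random

def downsample(records, max_reads=20_000_000, seed=0):
    """Return at most max_reads records, sampled uniformly at random.

    Uses reservoir sampling, so `records` can be any iterable
    (for example a generator over a FASTA file) and is read only once.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, record in enumerate(records):
        if i < max_reads:
            reservoir.append(record)  # fill the reservoir first
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < max_reads:
                reservoir[j] = record  # replace a random earlier pick
    return reservoir

# Toy run: keep 100 of 1000 reads.
sample = downsample((f"read_{i}" for i in range(1000)), max_reads=100)
```

Files at or below the cap are returned unchanged, which matches the behaviour described for inputs that do not exceed the limit.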
Step 4: Hybrid Identification
The cleaned data is processed to identify miRNA-target hybrids using:
4.1 hyb: Aligns reads to the reference transcript database using bowtie2.
4.2 Reference Database: Includes Ensembl genome assemblies and mature miRNAs from miRBase.
4.3 Binding Stability: Free energy (ΔG) and pairing patterns are calculated using UNAfold (v3.8).
Step 5: Conservation Score Calculation
Conservation scores assess evolutionary conservation of miRNA binding sites using phyloP tracks from the UCSC Genome Browser (e.g., hg38.phyloP100way for human, mm39.phyloP35way for mouse).
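As a rough illustration of how a per-site score can be derived from a per-base phyloP track, the sketch below averages the scores over a binding-site interval. Both the averaging and the in-memory lookup are assumptions: this section does not state how CLASHub aggregates phyloP values, and `site_conservation` is a hypothetical name.

```python
def site_conservation(phylop, chrom, start, end):
    """Mean phyloP score over the half-open interval [start, end).

    `phylop` is a hypothetical in-memory lookup {(chrom, pos): score};
    positions without a score are ignored. Returns None if no base
    in the interval has a score.
    """
    scores = [
        phylop[(chrom, pos)]
        for pos in range(start, end)
        if (chrom, pos) in phylop
    ]
    return sum(scores) / len(scores) if scores else None

# Toy track with two scored bases.
toy_track = {("chr1", 100): 2.0, ("chr1", 101): 4.0}
score = site_conservation(toy_track, "chr1", 100, 103)
```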
Step 6: Output Results
The final output includes an HTML summary report and a detailed results table featuring miRNA Name, Pairing Pattern, Gene Info, Conservation Score, Free Energy, Transcript Annotation, Piranha Peak p-values, and Normalized Hybrid Abundance.
5. How is miRNA AQ-seq data analyzed in CLASHub?
Step 1: Data Upload and Input
Users upload miRNA sequencing data in one of three supported formats:
1.1 Paired-End FASTQ (.gz) or Single-End FASTQ (.gz): Requires adapter sequences.
1.2 Cleaned Single-End FASTA (.gz): Does not require adapter sequences.
1.3 UMI Configuration: For libraries with UMIs (e.g., AQ-seq), specify the UMI length. For standard small RNA-seq libraries (e.g., Illumina TruSeq or NEBNext) lacking UMIs, set lengths to 0.
Step 2: Data Preprocessing
CLASHub processes uploaded data to produce clean FASTA files:
2.1 Adapter Trimming: Adapters are removed using cutadapt.
2.2 Read Merging: For paired-end files, reads are merged using PEAR.
2.3 Redundancy Collapse & UMI Trimming: If UMIs are specified (>0), PCR duplicates are collapsed via fastx_collapser and UMIs trimmed. If UMI lengths are 0, these steps are automatically skipped.
Step 3: miRNA Identification and Quantification
The cleaned data is analyzed for miRNA quantification using CLASHub.py.
3.1 miRNA Mapping: The first 18 nucleotides of each trimmed read are perfectly matched to mature miRNA sequences from miRBase (Release 22.1).
3.2 Quantification: Both total miRNA expression levels and isoform-specific abundances (capturing 3′ variations) are estimated.
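The prefix-matching idea in steps 3.1 and 3.2 can be sketched as follows. This is not the code in CLASHub.py: the function `quantify` is hypothetical, the reference holds only two example entries written in the DNA alphabet, and the tie-breaking rule for miRNAs that share the same first 18 nucleotides is an assumption.

```python
from collections import Counter, defaultdict

PREFIX_LEN = 18  # number of 5' nucleotides that must match exactly

def quantify(reads, mature):
    """Count reads per miRNA and per isoform.

    `mature` maps miRNA name -> mature sequence (DNA alphabet).
    Assumed simplification: if two miRNAs share the same 18-nt
    prefix, the first one listed wins.
    """
    prefix_to_name = {}
    for name, seq in mature.items():
        prefix_to_name.setdefault(seq[:PREFIX_LEN], name)

    totals = Counter()
    isoforms = defaultdict(Counter)
    for read in reads:
        name = prefix_to_name.get(read[:PREFIX_LEN])
        if name is None:
            continue  # no perfect match over the first 18 nt
        totals[name] += 1
        isoforms[name][read] += 1  # full read keeps the 3' end variation
    return totals, isoforms

mature = {
    "let-7a-5p": "TGAGGTAGTAGGTTGTATAGTT",
    "miR-21-5p": "TAGCTTATCAGACTGATGTTGA",
}
reads = [
    "TGAGGTAGTAGGTTGTATAGTT",   # canonical let-7a-5p
    "TGAGGTAGTAGGTTGTATAGT",    # 3' trimmed isoform
    "TGAGGTAGTAGGTTGTATAGTTA",  # 3' tailed isoform
    "TAGCTTATCAGACTGATGTTGA",   # canonical miR-21-5p
    "TAGCATATCAGACTGATGTTGA",   # mismatch within the first 18 nt: skipped
]
totals, isoforms = quantify(reads, mature)
```

Because only the first 18 nucleotides are compared, reads that differ at the 3′ end are assigned to the same miRNA but stay separate in the isoform table.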
Step 4: Output Results
The analysis generates a Total miRNA Table, an Isoform Expression Table, and a Summary HTML Report with key preprocessing and alignment metrics.
6. How is RNA-seq data analyzed in CLASHub?
The RNA-seq pipeline integrates HISAT2, StringTie, and DESeq2, with automated QC, optional Exon-Intron Split Analysis (EISA), and auto-repair mechanisms.
Step 1: Data Upload and Configuration
Users configure Adapter Sequences, UMI lengths (if applicable), Library Type (Stranded vs. Unstranded), and optionally enable EISA to distinguish post-transcriptional regulation.
Step 2: Preprocessing, Alignment & QC
2.1 Auto-Repair: Broken paired-end reads are automatically checked and repaired using repair.sh to maintain read integrity.
2.2 Trimming: Adapters and specified UMIs are removed using Cutadapt.
2.3 Alignment: Reads are aligned to the reference genome using HISAT2. Strand-specific flags are applied based on the library configuration.
2.4 Quality Check: RSeQC calculates read distribution across genomic features to verify library quality.
2.5 Sorting: SAM files are sorted to BAM using SAMtools.
Step 3: Standard Quantification
3.1 Abundance Estimation: StringTie quantifies gene expression using full Ensembl annotations to generate Transcripts Per Million (TPM).
3.2 Count Generation: The prepDE.py3 script extracts raw read counts for differential analysis.
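For readers unfamiliar with the unit, the textbook definition of TPM is sketched below. StringTie derives its values from read coverage of the annotated transcripts, so this is an illustration of the quantity itself, not of StringTie's internals. The function name `tpm` and the toy numbers are assumptions.

```python
def tpm(counts, lengths_bp):
    """Convert read counts to TPM using the textbook definition.

    counts and lengths_bp are dicts keyed by gene name; lengths are
    in base pairs.
    """
    # Reads per kilobase: normalise each gene's count by its length.
    rpk = {gene: counts[gene] / (lengths_bp[gene] / 1000) for gene in counts}
    # Scale so that the values of all genes sum to one million.
    scale = sum(rpk.values()) / 1_000_000
    if scale == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: value / scale for gene, value in rpk.items()}

# Two genes with equal counts; gene B is twice as long as gene A.
values = tpm({"A": 100, "B": 100}, {"A": 1000, "B": 2000})
```

In the toy example, gene A ends up with twice the TPM of gene B because the same number of reads is spread over half the length.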
Step 4: EISA Quantification (Optional Add-on)
If enabled, CLASHub performs parallel quantification using custom Exon-only and Intron-only GTF files (with overlapping genes excluded and boundaries masked) to generate separate count matrices for intronic and exonic reads.
Step 5: Differential Expression & Classification
5.1 Standard DE: DESeq2 calculates differential expression.
5.2 EISA Classification: If EISA is selected, changes are classified as Post-transcriptional (exons and introns diverge), Transcriptional (track together), or Ambiguous.
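The three EISA classes in step 5.2 can be illustrated with a simple rule on the exonic and intronic log2 fold changes. The cutoff of 1.0 and the function `classify_eisa` are assumptions made for this sketch. The criteria CLASHub actually applies (for example, any statistical test on the exon-intron difference) are not given in this section.

```python
def classify_eisa(d_exon, d_intron, threshold=1.0):
    """Classify a gene from its exonic and intronic log2 fold changes.

    The threshold is a hypothetical cutoff chosen for illustration.
    """
    if abs(d_exon - d_intron) >= threshold:
        # Exons and introns diverge: the mature mRNA changes
        # differently from the pre-mRNA.
        return "Post-transcriptional"
    if abs(d_exon) >= threshold and abs(d_intron) >= threshold:
        # Both change strongly and differ by less than the threshold,
        # so they necessarily move in the same direction.
        return "Transcriptional"
    return "Ambiguous"

examples = {
    "exon down, intron flat": classify_eisa(-1.5, -0.1),
    "both up together": classify_eisa(2.0, 1.8),
    "little change": classify_eisa(0.2, 0.1),
}
```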
Step 6: Output Files
Outputs include QC reports (HTML), standard DE tables (DESeq2 output), TPM/Count matrices, and—if EISA is enabled—classification tables isolating regulatory mechanisms.
7. How is cumulative fraction curve analysis performed in CLASHub?
Step 1: Data Upload and Input
Users upload a differential gene expression CSV file containing GeneName, BaseMean, and log2FoldChange. A BaseMean threshold (default: 100) filters out low-expression transcripts to ensure robust results.
Step 2: Target Identification
Target genes are classified into two groups:
2.1 CLASH-Derived Targets: Identified via experimental CLASH data (Conserved and All targets).
2.2 TargetScan-Derived Targets: Predicted interactions extracted from TargetScan databases.
Step 3: Curve Generation and Analysis Modes
The tool compares fold change distributions between miRNA targets and non-target genes using two available modes:
3.1 Standard Analysis: Groups targets by broad conservation status.
3.2 Stringent Filtering: Narrows the analysis specifically to the top 25% of high-efficacy targets based on TargetScan Context++ scores, revealing more pronounced repression patterns.
Statistical differences between target groups and background non-targets are quantified via Mann–Whitney U tests.
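The core of the analysis (the BaseMean filter, the cumulative fraction curve, and the Mann–Whitney U statistic) can be sketched with the Python standard library alone. The helper names, the toy CSV, and the choice of `>=` for the BaseMean cutoff are assumptions. A full analysis would also need a p-value, which statistical packages such as SciPy provide.

```python
import csv
import io

def load_filtered(csv_text, base_mean_min=100):
    """Return {GeneName: log2FoldChange} for genes passing the BaseMean filter."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return {
        row["GeneName"]: float(row["log2FoldChange"])
        for row in rows
        if float(row["BaseMean"]) >= base_mean_min  # assumed inclusive cutoff
    }

def ecdf(values):
    """Points (x, cumulative fraction) of the empirical distribution."""
    xs = sorted(values)
    return [(x, (i + 1) / len(xs)) for i, x in enumerate(xs)]

def mann_whitney_u(a, b):
    """U statistic for sample a: pairs (x, y) with x > y; ties count 0.5."""
    return sum((x > y) + 0.5 * (x == y) for x in a for y in b)

csv_text = """GeneName,BaseMean,log2FoldChange
G1,500,-0.8
G2,250,-0.5
G3,50,-2.0
G4,300,0.1
G5,1200,0.0
G6,150,0.3
"""
targets = {"G1", "G2", "G3"}  # hypothetical target set

fold_changes = load_filtered(csv_text)
target_fc = [fc for gene, fc in fold_changes.items() if gene in targets]
background_fc = [fc for gene, fc in fold_changes.items() if gene not in targets]
curve = ecdf(target_fc)
u_stat = mann_whitney_u(target_fc, background_fc)
```

In this toy data G3 is removed by the BaseMean filter, and every remaining target has a lower fold change than every non-target, so the U statistic for the target group is 0. This is the extreme case of a leftward-shifted curve.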
Step 4: Output Results
Outputs include SVG files of the Cumulative Fraction Curves visually plotting the repression shifts, alongside a comprehensive merged CSV dataset that annotates each gene with its specific target classification (e.g., top 25% Context++, high-confidence CLASH overlaps, or non-targets).