DiBiG
ICBR Bioinformatics
Powered by Actor, v1.0

RNAseq - Alignment and differential expression analysis

Title: GE7114-NSD2deg
Project: (none)
Started on: 12/11/2023 11:51:13
Hostname: login7.ufhpc
Run directory: /blue/licht/runs/NSD2-E1099K-Project/GE7114/GE7114-NSD2deg
Configuration: GE7114-NSD2deg.conf
Table of contents:
  1. Input data
  2. Trimming and quality control
  3. Alignment to transcriptome
  4. Genome coverage
  5. Expression analysis - quantification
  6. Differential expression - protein-coding genes
  7. Differential expression - all genes
  8. Differential expression - isoform level
  9. Differential expression - combined files
  10. Alternative splicing analysis
  11. MultiQC report
  12. UCSC hub
  13. Methods summary
1. Input data
The following table summarizes the samples, conditions, and contrasts in this analysis. A readset is either a single fastq file or a pair of fastq files (for paired-end sequencing).

Summary of input data
Reference genome: hg38
Experimental conditions: RCH-ACV-Mut-0, RCH-ACV-Mut-UNC8732-10, RCH-ACV-Mut-UNC8884-10, RCH-ACV-WT-0, RCH-ACV-WT-UNC8732-10, RCH-ACV-WT-UNC8884-10
Contrasts: RCH-ACV-Mut-UNC8732-10 vs. RCH-ACV-Mut-0, RCH-ACV-Mut-UNC8732-10 vs. RCH-ACV-Mut-UNC8884-10, RCH-ACV-Mut-UNC8884-10 vs. RCH-ACV-Mut-0, RCH-ACV-WT-UNC8732-10 vs. RCH-ACV-WT-0, RCH-ACV-WT-UNC8732-10 vs. RCH-ACV-WT-UNC8884-10, RCH-ACV-WT-UNC8884-10 vs. RCH-ACV-WT-0, RCH-ACV-Mut-0 vs. RCH-ACV-WT-0, RCH-ACV-Mut-UNC8732-10 vs. RCH-ACV-WT-UNC8732-10, RCH-ACV-Mut-UNC8884-10 vs. RCH-ACV-WT-UNC8884-10
Number of samples: 18

Sequencing data
Total number of reads: 1,091,951,127
Average reads per sample: 60,663,951
Table 1. Summary of input data



Condition | Sample | Number of reads | % Reads
RCH-ACV-Mut-0 | RCH-ACV-Mut-0-Rep1 | 57,487,514 | 5.26%
RCH-ACV-Mut-0 | RCH-ACV-Mut-0-Rep2 | 63,183,522 | 5.79%
RCH-ACV-Mut-0 | RCH-ACV-Mut-0-Rep3 | 62,160,277 | 5.69%
RCH-ACV-Mut-UNC8732-10 | RCH-ACV-Mut-UNC8732-10-Rep1 | 56,927,704 | 5.21%
RCH-ACV-Mut-UNC8732-10 | RCH-ACV-Mut-UNC8732-10-Rep2 | 57,243,442 | 5.24%
RCH-ACV-Mut-UNC8732-10 | RCH-ACV-Mut-UNC8732-10-Rep3 | 57,728,890 | 5.29%
RCH-ACV-Mut-UNC8884-10 | RCH-ACV-Mut-UNC8884-10-Rep1 | 53,180,807 | 4.87%
RCH-ACV-Mut-UNC8884-10 | RCH-ACV-Mut-UNC8884-10-Rep2 | 61,911,780 | 5.67%
RCH-ACV-Mut-UNC8884-10 | RCH-ACV-Mut-UNC8884-10-Rep3 | 54,742,648 | 5.01%
RCH-ACV-WT-0 | RCH-ACV-WT-0-Rep1 | 63,693,353 | 5.83%
RCH-ACV-WT-0 | RCH-ACV-WT-0-Rep2 | 62,789,842 | 5.75%
RCH-ACV-WT-0 | RCH-ACV-WT-0-Rep3 | 65,609,592 | 6.01%
RCH-ACV-WT-UNC8732-10 | RCH-ACV-WT-UNC8732-10-Rep1 | 69,682,563 | 6.38%
RCH-ACV-WT-UNC8732-10 | RCH-ACV-WT-UNC8732-10-Rep2 | 59,147,561 | 5.42%
RCH-ACV-WT-UNC8732-10 | RCH-ACV-WT-UNC8732-10-Rep3 | 61,467,434 | 5.63%
RCH-ACV-WT-UNC8884-10 | RCH-ACV-WT-UNC8884-10-Rep1 | 57,058,948 | 5.23%
RCH-ACV-WT-UNC8884-10 | RCH-ACV-WT-UNC8884-10-Rep2 | 70,737,545 | 6.48%
RCH-ACV-WT-UNC8884-10 | RCH-ACV-WT-UNC8884-10-Rep3 | 57,197,705 | 5.24%
Table 2. Number of reads in each sample.
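As a rough illustration of how the "% Reads" column and the per-sample average relate to the totals in Table 1, the following sketch recomputes them for three samples. The read counts are copied from Table 2; the function name and the use of truncating integer division for the average are assumptions made for this example, not a description of the pipeline's own code.

```python
# Read counts copied from Table 2 (three of the 18 samples).
reads = {
    "RCH-ACV-Mut-0-Rep1": 57_487_514,
    "RCH-ACV-Mut-0-Rep2": 63_183_522,
    "RCH-ACV-WT-UNC8884-10-Rep2": 70_737_545,
}

TOTAL_READS = 1_091_951_127  # "Total number of reads" in Table 1
N_SAMPLES = 18               # "Number of samples" in Table 1


def percent_of_total(n_reads, total=TOTAL_READS):
    """Share of all reads contributed by one sample, as a percentage."""
    return round(100.0 * n_reads / total, 2)


# Assumption: the reported average is the truncated integer quotient.
average_reads = TOTAL_READS // N_SAMPLES

shares = {sample: percent_of_total(n) for sample, n in reads.items()}

for sample, share in shares.items():
    print(f"{sample}: {share:.2f}%")
print(f"Average reads per sample: {average_reads:,}")
```

With an even split each of the 18 samples would hold about 5.56% of the reads, so the 4.87% to 6.48% range in Table 2 indicates reasonably balanced sequencing depth.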

2. Trimming and quality control
The input sequences were trimmed using trimmomatic. Quality control was performed before and after trimming using FastQC. The following table provides links to the quality control reports before and after trimming, as well as the number of reads in the trimmed files.

Sample | Readset | Reads before trim | QC before trim | Reads after trim | QC after trim | % Retained
RCH-ACV-Mut-0-Rep1 | RCH-ACV-Mut-0-Rep1_r1 | 57,487,514 | RCH-ACV-Mut-0-Rep1_S7_L004_R1_001, RCH-ACV-Mut-0-Rep1_S7_L004_R2_001 | 54,916,807 | RCH-ACV-Mut-0-Rep1_S7_L004_R1_001.trim.paired, RCH-ACV-Mut-0-Rep1_S7_L004_R2_001.trim.paired | 95.53%
RCH-ACV-Mut-0-Rep2 | RCH-ACV-Mut-0-Rep2_r1 | 63,183,522 | RCH-ACV-Mut-0-Rep2_S8_L004_R1_001, RCH-ACV-Mut-0-Rep2_S8_L004_R2_001 | 60,115,671 | RCH-ACV-Mut-0-Rep2_S8_L004_R1_001.trim.paired, RCH-ACV-Mut-0-Rep2_S8_L004_R2_001.trim.paired | 95.14%
RCH-ACV-Mut-0-Rep3 | RCH-ACV-Mut-0-Rep3_r1 | 62,160,277 | RCH-ACV-Mut-0-Rep3_S9_L004_R1_001, RCH-ACV-Mut-0-Rep3_S9_L004_R2_001 | 59,512,289 | RCH-ACV-Mut-0-Rep3_S9_L004_R1_001.trim.paired, RCH-ACV-Mut-0-Rep3_S9_L004_R2_001.trim.paired | 95.74%
RCH-ACV-Mut-UNC8732-10-Rep1 | RCH-ACV-Mut-UNC8732-10-Rep1_r1 | 56,927,704 | RCH-ACV-Mut-UNC8732-10-Rep1_S10_L004_R1_001, RCH-ACV-Mut-UNC8732-10-Rep1_S10_L004_R2_001 | 54,754,762 | RCH-ACV-Mut-UNC8732-10-Rep1_S10_L004_R1_001.trim.paired, RCH-ACV-Mut-UNC8732-10-Rep1_S10_L004_R2_001.trim.paired | 96.18%
RCH-ACV-Mut-UNC8732-10-Rep2 | RCH-ACV-Mut-UNC8732-10-Rep2_r1 | 57,243,442 | RCH-ACV-Mut-UNC8732-10-Rep2_S11_L004_R1_001, RCH-ACV-Mut-UNC8732-10-Rep2_S11_L004_R2_001 | 55,078,957 | RCH-ACV-Mut-UNC8732-10-Rep2_S11_L004_R1_001.trim.paired, RCH-ACV-Mut-UNC8732-10-Rep2_S11_L004_R2_001.trim.paired | 96.22%
RCH-ACV-Mut-UNC8732-10-Rep3 | RCH-ACV-Mut-UNC8732-10-Rep3_r1 | 57,728,890 | RCH-ACV-Mut-UNC8732-10-Rep3_S12_L004_R1_001, RCH-ACV-Mut-UNC8732-10-Rep3_S12_L004_R2_001 | 55,233,953 | RCH-ACV-Mut-UNC8732-10-Rep3_S12_L004_R1_001.trim.paired, RCH-ACV-Mut-UNC8732-10-Rep3_S12_L004_R2_001.trim.paired | 95.68%
RCH-ACV-Mut-UNC8884-10-Rep1 | RCH-ACV-Mut-UNC8884-10-Rep1_r1 | 53,180,807 | RCH-ACV-Mut-UNC8884-10-Rep1_S13_L004_R1_001, RCH-ACV-Mut-UNC8884-10-Rep1_S13_L004_R2_001 | 51,287,198 | RCH-ACV-Mut-UNC8884-10-Rep1_S13_L004_R1_001.trim.paired, RCH-ACV-Mut-UNC8884-10-Rep1_S13_L004_R2_001.trim.paired | 96.44%
RCH-ACV-Mut-UNC8884-10-Rep2 | RCH-ACV-Mut-UNC8884-10-Rep2_r1 | 61,911,780 | RCH-ACV-Mut-UNC8884-10-Rep2_S14_L004_R1_001, RCH-ACV-Mut-UNC8884-10-Rep2_S14_L004_R2_001 | 59,482,593 | RCH-ACV-Mut-UNC8884-10-Rep2_S14_L004_R1_001.trim.paired, RCH-ACV-Mut-UNC8884-10-Rep2_S14_L004_R2_001.trim.paired | 96.08%
RCH-ACV-Mut-UNC8884-10-Rep3 | RCH-ACV-Mut-UNC8884-10-Rep3_r1 | 54,742,648 | RCH-ACV-Mut-UNC8884-10-Rep3_S15_L004_R1_001, RCH-ACV-Mut-UNC8884-10-Rep3_S15_L004_R2_001 | 52,552,388 | RCH-ACV-Mut-UNC8884-10-Rep3_S15_L004_R1_001.trim.paired, RCH-ACV-Mut-UNC8884-10-Rep3_S15_L004_R2_001.trim.paired | 96.00%
RCH-ACV-WT-0-Rep1 | RCH-ACV-WT-0-Rep1_r1 | 63,693,353 | RCH-ACV-WT-0-Rep1_S16_L004_R1_001, RCH-ACV-WT-0-Rep1_S16_L004_R2_001 | 60,954,352 | RCH-ACV-WT-0-Rep1_S16_L004_R1_001.trim.paired, RCH-ACV-WT-0-Rep1_S16_L004_R2_001.trim.paired | 95.70%
RCH-ACV-WT-0-Rep2 | RCH-ACV-WT-0-Rep2_r1 | 62,789,842 | RCH-ACV-WT-0-Rep2_S17_L004_R1_001, RCH-ACV-WT-0-Rep2_S17_L004_R2_001 | 60,531,876 | RCH-ACV-WT-0-Rep2_S17_L004_R1_001.trim.paired, RCH-ACV-WT-0-Rep2_S17_L004_R2_001.trim.paired | 96.40%
RCH-ACV-WT-0-Rep3 | RCH-ACV-WT-0-Rep3_r1 | 65,609,592 | RCH-ACV-WT-0-Rep3_S18_L004_R1_001, RCH-ACV-WT-0-Rep3_S18_L004_R2_001 | 62,530,671 | RCH-ACV-WT-0-Rep3_S18_L004_R1_001.trim.paired, RCH-ACV-WT-0-Rep3_S18_L004_R2_001.trim.paired | 95.31%
RCH-ACV-WT-UNC8732-10-Rep1 | RCH-ACV-WT-UNC8732-10-Rep1_r1 | 69,682,563 | RCH-ACV-WT-UNC8732-10-Rep1_S19_L004_R1_001, RCH-ACV-WT-UNC8732-10-Rep1_S19_L004_R2_001 | 66,892,334 | RCH-ACV-WT-UNC8732-10-Rep1_S19_L004_R1_001.trim.paired, RCH-ACV-WT-UNC8732-10-Rep1_S19_L004_R2_001.trim.paired | 96.00%
RCH-ACV-WT-UNC8732-10-Rep2 | RCH-ACV-WT-UNC8732-10-Rep2_r1 | 59,147,561 | RCH-ACV-WT-UNC8732-10-Rep2_S20_L004_R1_001, RCH-ACV-WT-UNC8732-10-Rep2_S20_L004_R2_001 | 56,710,740 | RCH-ACV-WT-UNC8732-10-Rep2_S20_L004_R1_001.trim.paired, RCH-ACV-WT-UNC8732-10-Rep2_S20_L004_R2_001.trim.paired | 95.88%
RCH-ACV-WT-UNC8732-10-Rep3 | RCH-ACV-WT-UNC8732-10-Rep3_r1 | 61,467,434 | RCH-ACV-WT-UNC8732-10-Rep3_S21_L004_R1_001, RCH-ACV-WT-UNC8732-10-Rep3_S21_L004_R2_001 | 58,076,677 | RCH-ACV-WT-UNC8732-10-Rep3_S21_L004_R1_001.trim.paired, RCH-ACV-WT-UNC8732-10-Rep3_S21_L004_R2_001.trim.paired | 94.48%
RCH-ACV-WT-UNC8884-10-Rep1 | RCH-ACV-WT-UNC8884-10-Rep1_r1 | 57,058,948 | RCH-ACV-WT-UNC8884-10-Rep1_S22_L004_R1_001, RCH-ACV-WT-UNC8884-10-Rep1_S22_L004_R2_001 | 54,555,014 | RCH-ACV-WT-UNC8884-10-Rep1_S22_L004_R1_001.trim.paired, RCH-ACV-WT-UNC8884-10-Rep1_S22_L004_R2_001.trim.paired | 95.61%
RCH-ACV-WT-UNC8884-10-Rep2 | RCH-ACV-WT-UNC8884-10-Rep2_r1 | 70,737,545 | RCH-ACV-WT-UNC8884-10-Rep2_S23_L004_R1_001, RCH-ACV-WT-UNC8884-10-Rep2_S23_L004_R2_001 | 68,146,673 | RCH-ACV-WT-UNC8884-10-Rep2_S23_L004_R1_001.trim.paired, RCH-ACV-WT-UNC8884-10-Rep2_S23_L004_R2_001.trim.paired | 96.34%
RCH-ACV-WT-UNC8884-10-Rep3 | RCH-ACV-WT-UNC8884-10-Rep3_r1 | 57,197,705 | RCH-ACV-WT-UNC8884-10-Rep3_S24_L004_R1_001, RCH-ACV-WT-UNC8884-10-Rep3_S24_L004_R2_001 | 54,897,195 | RCH-ACV-WT-UNC8884-10-Rep3_S24_L004_R1_001.trim.paired, RCH-ACV-WT-UNC8884-10-Rep3_S24_L004_R2_001.trim.paired | 95.98%
Table 3. Number of reads in input files and links to QC reports.

The following two tables report the number of reads before and after QC in each sample and in each condition.

Sample | Reads before QC | Reads after QC | % Retained
RCH-ACV-Mut-0-Rep1 | 57,487,514 | 54,916,807 | 95.53%
RCH-ACV-Mut-0-Rep2 | 63,183,522 | 60,115,671 | 95.14%
RCH-ACV-Mut-0-Rep3 | 62,160,277 | 59,512,289 | 95.74%
RCH-ACV-Mut-UNC8732-10-Rep1 | 56,927,704 | 54,754,762 | 96.18%
RCH-ACV-Mut-UNC8732-10-Rep2 | 57,243,442 | 55,078,957 | 96.22%
RCH-ACV-Mut-UNC8732-10-Rep3 | 57,728,890 | 55,233,953 | 95.68%
RCH-ACV-Mut-UNC8884-10-Rep1 | 53,180,807 | 51,287,198 | 96.44%
RCH-ACV-Mut-UNC8884-10-Rep2 | 61,911,780 | 59,482,593 | 96.08%
RCH-ACV-Mut-UNC8884-10-Rep3 | 54,742,648 | 52,552,388 | 96.00%
RCH-ACV-WT-0-Rep1 | 63,693,353 | 60,954,352 | 95.70%
RCH-ACV-WT-0-Rep2 | 62,789,842 | 60,531,876 | 96.40%
RCH-ACV-WT-0-Rep3 | 65,609,592 | 62,530,671 | 95.31%
RCH-ACV-WT-UNC8732-10-Rep1 | 69,682,563 | 66,892,334 | 96.00%
RCH-ACV-WT-UNC8732-10-Rep2 | 59,147,561 | 56,710,740 | 95.88%
RCH-ACV-WT-UNC8732-10-Rep3 | 61,467,434 | 58,076,677 | 94.48%
RCH-ACV-WT-UNC8884-10-Rep1 | 57,058,948 | 54,555,014 | 95.61%
RCH-ACV-WT-UNC8884-10-Rep2 | 70,737,545 | 68,146,673 | 96.34%
RCH-ACV-WT-UNC8884-10-Rep3 | 57,197,705 | 54,897,195 | 95.98%
Table 4. Number of reads in each sample before and after QC.



Condition | Reads before QC | Reads after QC | % Retained
RCH-ACV-Mut-0 | 182,831,313 | 174,544,767 | 95.47%
RCH-ACV-Mut-UNC8732-10 | 171,900,036 | 165,067,672 | 96.03%
RCH-ACV-Mut-UNC8884-10 | 169,835,235 | 163,322,179 | 96.17%
RCH-ACV-WT-0 | 192,092,787 | 184,016,899 | 95.80%
RCH-ACV-WT-UNC8732-10 | 190,297,558 | 181,679,751 | 95.47%
RCH-ACV-WT-UNC8884-10 | 184,994,198 | 177,598,882 | 96.00%
Table 5. Number of reads in each condition before and after QC.
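The per-condition figures are the sums of the per-sample counts, and the retention percentage is the ratio of reads after QC to reads before QC. The sketch below reproduces the RCH-ACV-Mut-0 row from its three replicates; the data layout and variable names are illustrative assumptions, and only the numbers come from the tables above.

```python
from collections import defaultdict

# (condition, reads before QC, reads after QC), copied from Table 4
# for the three RCH-ACV-Mut-0 replicates.
samples = [
    ("RCH-ACV-Mut-0", 57_487_514, 54_916_807),
    ("RCH-ACV-Mut-0", 63_183_522, 60_115_671),
    ("RCH-ACV-Mut-0", 62_160_277, 59_512_289),
]

# Sum the before/after counts for each condition.
totals = defaultdict(lambda: [0, 0])
for condition, before, after in samples:
    totals[condition][0] += before
    totals[condition][1] += after

# Retention is computed on the summed counts, not as a mean of percentages.
retained = {
    condition: round(100.0 * after / before, 2)
    for condition, (before, after) in totals.items()
}

for condition, (before, after) in totals.items():
    print(f"{condition}: {before:,} -> {after:,} ({retained[condition]:.2f}%)")
```

Computing the percentage from summed counts weights each replicate by its depth, which is why a condition-level value can differ slightly from the plain average of its replicate percentages.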

3. Alignment to transcriptome
The input sequences were aligned to the hg38 transcriptome using STAR 2.7.9a. The following table reports the number of alignments to the genome and the transcriptome for each sample. Please note that the number of alignments will in general be higher than the number of reads because the same read may align to multiple isoforms of the same gene. The WIG files can be uploaded to the UCSC Genome Browser as custom tracks.

Sample | Input reads | Genome alignments | Genome alignment rate | Transcriptome alignments | Transcriptome alignment rate | Alignment report
RCH-ACV-Mut-0-Rep1 | 54,916,807 | 119,051,309 | 2.17 | 51,528,260 | 93.83% | RCH-ACV-Mut-0-Rep1.star/Log.final.out
RCH-ACV-Mut-0-Rep2 | 60,115,671 | 131,094,999 | 2.18 | 56,243,510 | 93.56% | RCH-ACV-Mut-0-Rep2.star/Log.final.out
RCH-ACV-Mut-0-Rep3 | 59,512,289 | 130,396,640 | 2.19 | 55,560,216 | 93.36% | RCH-ACV-Mut-0-Rep3.star/Log.final.out
RCH-ACV-Mut-UNC8732-10-Rep1 | 54,754,762 | 118,885,537 | 2.17 | 51,332,527 | 93.75% | RCH-ACV-Mut-UNC8732-10-Rep1.star/Log.final.out
RCH-ACV-Mut-UNC8732-10-Rep2 | 55,078,957 | 118,930,375 | 2.16 | 51,778,191 | 94.01% | RCH-ACV-Mut-UNC8732-10-Rep2.star/Log.final.out
RCH-ACV-Mut-UNC8732-10-Rep3 | 55,233,953 | 118,122,183 | 2.14 | 52,162,949 | 94.44% | RCH-ACV-Mut-UNC8732-10-Rep3.star/Log.final.out
RCH-ACV-Mut-UNC8884-10-Rep1 | 51,287,198 | 111,998,630 | 2.18 | 47,996,016 | 93.58% | RCH-ACV-Mut-UNC8884-10-Rep1.star/Log.final.out
RCH-ACV-Mut-UNC8884-10-Rep2 | 59,482,593 | 128,541,618 | 2.16 | 55,839,126 | 93.87% | RCH-ACV-Mut-UNC8884-10-Rep2.star/Log.final.out
RCH-ACV-Mut-UNC8884-10-Rep3 | 52,552,388 | 112,491,449 | 2.14 | 49,598,029 | 94.38% | RCH-ACV-Mut-UNC8884-10-Rep3.star/Log.final.out
RCH-ACV-WT-0-Rep1 | 60,954,352 | 132,856,268 | 2.18 | 57,023,809 | 93.55% | RCH-ACV-WT-0-Rep1.star/Log.final.out
RCH-ACV-WT-0-Rep2 | 60,531,876 | 130,675,134 | 2.16 | 56,866,606 | 93.94% | RCH-ACV-WT-0-Rep2.star/Log.final.out
RCH-ACV-WT-0-Rep3 | 62,530,671 | 137,047,538 | 2.19 | 58,313,878 | 93.26% | RCH-ACV-WT-0-Rep3.star/Log.final.out
RCH-ACV-WT-UNC8732-10-Rep1 | 66,892,334 | 145,200,175 | 2.17 | 62,714,011 | 93.75% | RCH-ACV-WT-UNC8732-10-Rep1.star/Log.final.out
RCH-ACV-WT-UNC8732-10-Rep2 | 56,710,740 | 122,713,326 | 2.16 | 53,222,164 | 93.85% | RCH-ACV-WT-UNC8732-10-Rep2.star/Log.final.out
RCH-ACV-WT-UNC8732-10-Rep3 | 58,076,677 | 124,770,018 | 2.15 | 54,665,310 | 94.13% | RCH-ACV-WT-UNC8732-10-Rep3.star/Log.final.out
RCH-ACV-WT-UNC8884-10-Rep1 | 54,555,014 | 117,419,791 | 2.15 | 51,321,238 | 94.07% | RCH-ACV-WT-UNC8884-10-Rep1.star/Log.final.out
RCH-ACV-WT-UNC8884-10-Rep2 | 68,146,673 | 147,903,906 | 2.17 | 63,904,487 | 93.77% | RCH-ACV-WT-UNC8884-10-Rep2.star/Log.final.out
RCH-ACV-WT-UNC8884-10-Rep3 | 54,897,195 | 119,850,994 | 2.18 | 51,341,720 | 93.52% | RCH-ACV-WT-UNC8884-10-Rep3.star/Log.final.out
Table 6. Number of alignments to genome and transcriptome.
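The two rate columns in Table 6 use different units: the genome figure is a ratio of alignments to input reads (around 2.17), while the transcriptome figure is a percentage of input reads. The sketch below recomputes both for RCH-ACV-Mut-0-Rep1; the function names are made up for this example, and the interpretation of the columns is inferred from the tabulated values rather than stated by the report.

```python
# Values copied from the RCH-ACV-Mut-0-Rep1 row of Table 6.
input_reads = 54_916_807
genome_alignments = 119_051_309
transcriptome_alignments = 51_528_260


def alignments_per_read(alignments, reads):
    """Ratio of alignments to input reads (can exceed 1)."""
    return round(alignments / reads, 2)


def alignment_percent(alignments, reads):
    """Alignments expressed as a percentage of input reads."""
    return round(100.0 * alignments / reads, 2)


genome_rate = alignments_per_read(genome_alignments, input_reads)
transcriptome_rate = alignment_percent(transcriptome_alignments, input_reads)

print(f"Genome alignment rate: {genome_rate:.2f} alignments per read")
print(f"Transcriptome alignment rate: {transcriptome_rate:.2f}%")
```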

4. Genome coverage
The following table reports the overall and effective genome coverage in each sample. The Total nt column reports the total number of nucleotides sequenced, i.e. the number of aligned reads times the length of each read. Coverage is this number divided by the size of the genome. Effective bp reports the number of bases in the genome having coverage greater than 5, and the Effective Perc column shows what percentage this is of the genome size. Note that, especially in the case of RNA-seq, the effective genome size may be much smaller than the full size. Eff Coverage is the average coverage over the effectively covered fraction of the genome.

Name | Total nt | Coverage | Effective bp | Effective Perc | Eff Coverage
RCH-ACV-Mut-0-Rep1 | 162,359,399,309 | 52.61 | 725,937,498 | 23.50% | 223.65
RCH-ACV-Mut-0-Rep2 | 173,470,474,947 | 56.79 | 735,440,232 | 24.10% | 235.87
RCH-ACV-Mut-0-Rep3 | 169,697,495,513 | 54.99 | 738,541,199 | 23.90% | 229.77
RCH-ACV-Mut-UNC8732-10-Rep1 | 162,204,263,292 | 53.10 | 682,408,676 | 22.30% | 237.69
RCH-ACV-Mut-UNC8732-10-Rep2 | 164,530,180,178 | 53.86 | 682,916,590 | 22.40% | 240.92
RCH-ACV-Mut-UNC8732-10-Rep3 | 167,233,465,090 | 54.87 | 686,353,098 | 22.50% | 243.66
RCH-ACV-Mut-UNC8884-10-Rep1 | 150,389,697,801 | 49.25 | 720,241,037 | 23.60% | 208.80
RCH-ACV-Mut-UNC8884-10-Rep2 | 175,959,448,290 | 57.63 | 738,459,466 | 24.20% | 238.28
RCH-ACV-Mut-UNC8884-10-Rep3 | 161,792,706,507 | 52.97 | 723,223,916 | 23.70% | 223.71
RCH-ACV-WT-0-Rep1 | 174,594,332,103 | 57.18 | 716,401,137 | 23.50% | 243.71
RCH-ACV-WT-0-Rep2 | 179,158,818,430 | 58.67 | 709,809,392 | 23.20% | 252.40
RCH-ACV-WT-0-Rep3 | 180,265,521,704 | 59.00 | 711,831,117 | 23.30% | 253.24
RCH-ACV-WT-UNC8732-10-Rep1 | 191,110,774,861 | 61.93 | 695,287,008 | 22.50% | 274.87
RCH-ACV-WT-UNC8732-10-Rep2 | 167,360,573,233 | 54.81 | 674,301,421 | 22.10% | 248.20
RCH-ACV-WT-UNC8732-10-Rep3 | 171,705,791,015 | 56.28 | 679,251,290 | 22.30% | 252.79
RCH-ACV-WT-UNC8884-10-Rep1 | 164,393,721,563 | 53.84 | 699,968,282 | 22.90% | 234.86
RCH-ACV-WT-UNC8884-10-Rep2 | 195,734,092,652 | 64.08 | 728,822,827 | 23.90% | 268.56
RCH-ACV-WT-UNC8884-10-Rep3 | 159,848,120,479 | 52.33 | 705,809,331 | 23.10% | 226.47
Table 7. Genome coverage by sample.
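To make the relationship between the coverage columns concrete, the sketch below works through the RCH-ACV-Mut-0-Rep1 row. The tabulated Eff Coverage values are consistent with Total nt divided by Effective bp; the genome size is not stated in this report, so the example only back-calculates the size implied by the Coverage column, and that derived figure should be read as an approximation. Function names are hypothetical.

```python
# Values copied from the RCH-ACV-Mut-0-Rep1 row of Table 7.
total_nt = 162_359_399_309   # total nucleotides in aligned reads
coverage = 52.61             # reported overall coverage
effective_bp = 725_937_498   # bases with coverage greater than 5


def effective_coverage(total_nt, effective_bp):
    """Average coverage over the effectively covered bases."""
    return round(total_nt / effective_bp, 2)


def implied_genome_size(total_nt, coverage):
    """Genome size implied by Coverage = Total nt / genome size.

    This is derived from rounded table values, so it is approximate.
    """
    return total_nt / coverage


eff_cov = effective_coverage(total_nt, effective_bp)
genome_size = implied_genome_size(total_nt, coverage)

print(f"Eff Coverage: {eff_cov:.2f}")
print(f"Implied genome size: {genome_size / 1e9:.2f} Gb")
```

Because only about a quarter of the genome is effectively covered in each sample, the effective coverage is roughly four times the overall coverage, as expected for RNA-seq data.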

The following table reports the overall and effective genome coverage in each condition.

Name | Total nt | Coverage | Effective bp | Effective Perc | Eff Coverage
RCH-ACV-Mut-0 | 461,806,453,730 | 149.64 | 869,447,499 | 28.20% | 531.15
RCH-ACV-Mut-UNC8732-10 | 452,723,886,149 | 148.20 | 796,823,244 | 26.10% | 568.16
RCH-ACV-Mut-UNC8884-10 | 447,371,960,189 | 144.96 | 864,728,548 | 28.00% | 517.36
RCH-ACV-WT-0 | 0 | 0.00 | 0 | 0.00% | 0.00
RCH-ACV-WT-UNC8732-10 | 482,427,070,177 | 156.32 | 793,655,141 | 25.70% | 607.85
RCH-ACV-WT-UNC8884-10 | 474,019,470,392 | 155.18 | 831,322,825 | 27.20% | 570.20
Table 8. Genome coverage by condition

File: GE7114-NSD2deg.sample.cov.xlsx
Size: 46.04 kB
Description: Per-chromosome coverage data, by sample.

File: GE7114-NSD2deg.cond.cov.xlsx
Size: 16.99 kB
Description: Per-chromosome coverage data, by condition.

5. Expression analysis - quantification
Gene and transcript expression values were quantified using RSEM v1.3.1. The following files contain the raw FPKM values for all genes/transcripts in all samples. NOTE: these values are not yet normalized; apply an appropriate normalization before using them in downstream analysis.
File: genes.rawmatrix.csv
Size: 6.86 MB
Description: Matrix of FPKM values for all genes in all samples.

File: transcripts.rawmatrix.csv
Size: 24.45 MB
Description: Matrix of FPKM values for all transcripts in all samples.

File: genes.xpra.txt
Size: 5.89 MB
Description: Counts table suitable for ExpressAnalyst.
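The report does not say which normalization is appropriate for the raw FPKM matrices. As one common option, FPKM values can be rescaled to TPM within each sample so that every sample sums to one million. The sketch below shows that rescaling on made-up numbers; it is an illustration under that assumption, not the normalization used elsewhere in this report (the differential expression results rely on DESeq2).

```python
def fpkm_to_tpm(fpkm_values):
    """Rescale one sample's FPKM values so that they sum to 1e6 (TPM)."""
    total = sum(fpkm_values)
    if total == 0:
        # No expression at all: return zeros instead of dividing by zero.
        return [0.0 for _ in fpkm_values]
    return [value / total * 1_000_000 for value in fpkm_values]


# Hypothetical FPKM values for three genes in two samples. Sample B has
# the same relative expression as sample A but a 10x lower overall scale.
sample_a = [10.0, 30.0, 60.0]
sample_b = [1.0, 3.0, 6.0]

tpm_a = fpkm_to_tpm(sample_a)
tpm_b = fpkm_to_tpm(sample_b)

print([round(v) for v in tpm_a])
print([round(v) for v in tpm_b])
```

After rescaling, the two hypothetical samples have the same profile, which is the property that makes TPM values easier to compare across samples than raw FPKM.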

The following scatterplots show the level of similarity between replicates of the same condition.


Principal Component Analysis on raw (un-normalized) expression data. Click on the thumbnail to display the full-size image.

(png format, 274.38 kB)


The following image displays the Multi-Dimensional Scaling (MDS) plot for the raw (un-normalized) expression data. Click on the thumbnail to display the full-size image.
6. Differential expression - protein-coding genes
Differential gene expression was analyzed using DESeq2. The following table reports the number of differentially expressed genes in each contrast with abs(log2(FC)) >= 1.0 and FDR-corrected P-value <= 0.05. The files under the Table heading contain the log2(FC) and P-value of all significant genes, while the files under the Expressions heading contain normalized expression values for the significant genes in all replicates of the two conditions being compared. The lists of differentially expressed genes for all contrasts can also be downloaded as a single Excel file using the link below.

Test | Control | Total | Overexpressed | Underexpressed | Table | Expressions
RCH-ACV-Mut-UNC8732-10 | RCH-ACV-Mut-0 | 1,018 | 289 | 729 | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-Mut-0.codinggeneDiff.csv | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-Mut-0.gmatrix.csv
RCH-ACV-Mut-UNC8732-10 | RCH-ACV-Mut-UNC8884-10 | 997 | 262 | 735 | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-Mut-UNC8884-10.codinggeneDiff.csv | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-Mut-UNC8884-10.gmatrix.csv
RCH-ACV-Mut-UNC8884-10 | RCH-ACV-Mut-0 | 9 | 5 | 4 | RCH-ACV-Mut-UNC8884-10.vs.RCH-ACV-Mut-0.codinggeneDiff.csv | RCH-ACV-Mut-UNC8884-10.vs.RCH-ACV-Mut-0.gmatrix.csv
RCH-ACV-WT-UNC8732-10 | RCH-ACV-WT-0 | 410 | 51 | 359 | RCH-ACV-WT-UNC8732-10.vs.RCH-ACV-WT-0.codinggeneDiff.csv | RCH-ACV-WT-UNC8732-10.vs.RCH-ACV-WT-0.gmatrix.csv
RCH-ACV-WT-UNC8732-10 | RCH-ACV-WT-UNC8884-10 | 402 | 47 | 355 | RCH-ACV-WT-UNC8732-10.vs.RCH-ACV-WT-UNC8884-10.codinggeneDiff.csv | RCH-ACV-WT-UNC8732-10.vs.RCH-ACV-WT-UNC8884-10.gmatrix.csv
RCH-ACV-WT-UNC8884-10 | RCH-ACV-WT-0 | 8 | 4 | 4 | RCH-ACV-WT-UNC8884-10.vs.RCH-ACV-WT-0.codinggeneDiff.csv | RCH-ACV-WT-UNC8884-10.vs.RCH-ACV-WT-0.gmatrix.csv
RCH-ACV-Mut-0 | RCH-ACV-WT-0 | 833 | 599 | 234 | RCH-ACV-Mut-0.vs.RCH-ACV-WT-0.codinggeneDiff.csv | RCH-ACV-Mut-0.vs.RCH-ACV-WT-0.gmatrix.csv
RCH-ACV-Mut-UNC8732-10 | RCH-ACV-WT-UNC8732-10 | 980 | 668 | 312 | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-WT-UNC8732-10.codinggeneDiff.csv | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-WT-UNC8732-10.gmatrix.csv
RCH-ACV-Mut-UNC8884-10 | RCH-ACV-WT-UNC8884-10 | 856 | 626 | 230 | RCH-ACV-Mut-UNC8884-10.vs.RCH-ACV-WT-UNC8884-10.codinggeneDiff.csv | RCH-ACV-Mut-UNC8884-10.vs.RCH-ACV-WT-UNC8884-10.gmatrix.csv
Table 9. Results of gene-level differential expression analysis.
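The significance filter described above (abs(log2(FC)) >= 1.0 and FDR-corrected P-value <= 0.05) can be sketched as follows. The column names `gene`, `log2FC`, and `padj` and the example rows are hypothetical; the actual layout of the codinggeneDiff.csv files is not shown in this report, so treat this as an illustration of the thresholds only.

```python
import csv
import io

LOG2FC_MIN = 1.0  # abs(log2(FC)) >= 1.0, i.e. at least a 2-fold change
FDR_MAX = 0.05    # FDR-corrected P-value <= 0.05

# Hypothetical rows; column names are assumptions, not the real file layout.
EXAMPLE_CSV = """gene,log2FC,padj
GENE_A,1.80,0.001
GENE_B,-2.40,0.020
GENE_C,0.60,0.0001
GENE_D,-1.10,0.300
GENE_E,-1.00,0.050
"""


def classify(rows):
    """Split significant genes into overexpressed and underexpressed."""
    over, under = [], []
    for row in rows:
        log2fc = float(row["log2FC"])
        padj = float(row["padj"])
        if abs(log2fc) >= LOG2FC_MIN and padj <= FDR_MAX:
            (over if log2fc > 0 else under).append(row["gene"])
    return over, under


over, under = classify(csv.DictReader(io.StringIO(EXAMPLE_CSV)))
total = len(over) + len(under)

print(f"Total: {total}, Overexpressed: {len(over)}, Underexpressed: {len(under)}")
```

In this toy input, GENE_C fails the fold-change threshold and GENE_D fails the FDR threshold, while GENE_E sits exactly on both thresholds and is kept because the comparisons are inclusive. As in Table 9, Total is the sum of the Overexpressed and Underexpressed counts.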

File: GE7114-NSD2deg-codingdiff.xlsx
Size: 402.48 kB
Description: Excel file containing differentially expressed genes for all contrasts (one sheet per contrast). Only includes protein-coding genes.

File: GE7114-NSD2deg-allcodingdiff.xlsx
Size: 7.81 MB
Description: Excel file containing differential expression values for all tested genes in all contrasts (one sheet per contrast). Only includes protein-coding genes. Note: genes with very low average expression in all conditions were removed.

File: GE7114-NSD2deg.g.deseq2norm.xlsx
Size: 4.16 MB
Description: Excel file containing normalized (DESeq2) expression values for all protein-coding genes in all conditions. Note: genes with very low average expression in all conditions were removed.


Principal Component Analysis on normalized expression data. Click on the thumbnail to display the full-size image.

(png format, 278.86 kB)


The following image displays the Multi-Dimensional Scaling (MDS) plot for the normalized expression data. In this plot, relative distances between samples reflect the similarity of their gene expression profiles. Ideally, replicates of the same condition should be close together, and well separated from other conditions.

Volcano plots for all contrasts. Use the menu to select a contrast.

7. Differential expression - all genes
The following table reports results from the same differential analysis as above, but includes all biotypes instead of coding genes only.

Test | Control | Total | Overexpressed | Underexpressed | Table | Expressions
RCH-ACV-Mut-UNC8732-10 | RCH-ACV-Mut-0 | 1,237 | 324 | 913 | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-Mut-0.geneDiff.csv | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-Mut-0.gmatrix.csv
RCH-ACV-Mut-UNC8732-10 | RCH-ACV-Mut-UNC8884-10 | 1,196 | 311 | 885 | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-Mut-UNC8884-10.geneDiff.csv | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-Mut-UNC8884-10.gmatrix.csv
RCH-ACV-Mut-UNC8884-10 | RCH-ACV-Mut-0 | 13 | 6 | 7 | RCH-ACV-Mut-UNC8884-10.vs.RCH-ACV-Mut-0.geneDiff.csv | RCH-ACV-Mut-UNC8884-10.vs.RCH-ACV-Mut-0.gmatrix.csv
RCH-ACV-WT-UNC8732-10 | RCH-ACV-WT-0 | 471 | 59 | 412 | RCH-ACV-WT-UNC8732-10.vs.RCH-ACV-WT-0.geneDiff.csv | RCH-ACV-WT-UNC8732-10.vs.RCH-ACV-WT-0.gmatrix.csv
RCH-ACV-WT-UNC8732-10 | RCH-ACV-WT-UNC8884-10 | 463 | 53 | 410 | RCH-ACV-WT-UNC8732-10.vs.RCH-ACV-WT-UNC8884-10.geneDiff.csv | RCH-ACV-WT-UNC8732-10.vs.RCH-ACV-WT-UNC8884-10.gmatrix.csv
RCH-ACV-WT-UNC8884-10 | RCH-ACV-WT-0 | 9 | 5 | 4 | RCH-ACV-WT-UNC8884-10.vs.RCH-ACV-WT-0.geneDiff.csv | RCH-ACV-WT-UNC8884-10.vs.RCH-ACV-WT-0.gmatrix.csv
RCH-ACV-Mut-0 | RCH-ACV-WT-0 | 1,011 | 725 | 286 | RCH-ACV-Mut-0.vs.RCH-ACV-WT-0.geneDiff.csv | RCH-ACV-Mut-0.vs.RCH-ACV-WT-0.gmatrix.csv
RCH-ACV-Mut-UNC8732-10 | RCH-ACV-WT-UNC8732-10 | 1,192 | 791 | 401 | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-WT-UNC8732-10.geneDiff.csv | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-WT-UNC8732-10.gmatrix.csv
RCH-ACV-Mut-UNC8884-10 | RCH-ACV-WT-UNC8884-10 | 1,052 | 751 | 301 | RCH-ACV-Mut-UNC8884-10.vs.RCH-ACV-WT-UNC8884-10.geneDiff.csv | RCH-ACV-Mut-UNC8884-10.vs.RCH-ACV-WT-UNC8884-10.gmatrix.csv
Table 10. Results of gene-level differential expression analysis (all biotypes).

File: GE7114-NSD2deg-genediff.xlsx
Size: 491.14 kB
Description: Excel file containing differentially expressed genes for all contrasts (one sheet per contrast). Includes all genes and pseudo-genes.

File: GE7114-NSD2deg-allgenediff.xlsx
Size: 9.84 MB
Description: Excel file containing differential expression values for all genes in all contrasts (one sheet per contrast). Includes all genes and pseudo-genes.

File: GE7114-NSD2deg-allExpressions.xlsx
Size: 4.64 MB
Description: Excel file containing normalized (RSEM) expression values for all genes in all conditions.

8. Differential expression - isoform level
The following table reports the number of differentially expressed isoforms in each contrast with abs(log2(FC)) >= 1.0 and FDR-corrected P-value <= 0.05. The lists of differentially expressed isoforms for all contrasts can also be downloaded as a single Excel file using the link below.

Test | Control | Tot isoforms | Overexpressed | Underexpressed | Table | Expressions
RCH-ACV-Mut-UNC8732-10 | RCH-ACV-Mut-0 | 2,216 | 670 | 1,546 | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-Mut-0.isoDiff.csv | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-Mut-0.imatrix.csv
RCH-ACV-Mut-UNC8732-10 | RCH-ACV-Mut-UNC8884-10 | 2,112 | 652 | 1,460 | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-Mut-UNC8884-10.isoDiff.csv | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-Mut-UNC8884-10.imatrix.csv
RCH-ACV-Mut-UNC8884-10 | RCH-ACV-Mut-0 | 218 | 133 | 85 | RCH-ACV-Mut-UNC8884-10.vs.RCH-ACV-Mut-0.isoDiff.csv | RCH-ACV-Mut-UNC8884-10.vs.RCH-ACV-Mut-0.imatrix.csv
RCH-ACV-WT-UNC8732-10 | RCH-ACV-WT-0 | 965 | 255 | 710 | RCH-ACV-WT-UNC8732-10.vs.RCH-ACV-WT-0.isoDiff.csv | RCH-ACV-WT-UNC8732-10.vs.RCH-ACV-WT-0.imatrix.csv
RCH-ACV-WT-UNC8732-10 | RCH-ACV-WT-UNC8884-10 | 958 | 230 | 728 | RCH-ACV-WT-UNC8732-10.vs.RCH-ACV-WT-UNC8884-10.isoDiff.csv | RCH-ACV-WT-UNC8732-10.vs.RCH-ACV-WT-UNC8884-10.imatrix.csv
RCH-ACV-WT-UNC8884-10 | RCH-ACV-WT-0 | 187 | 102 | 85 | RCH-ACV-WT-UNC8884-10.vs.RCH-ACV-WT-0.isoDiff.csv | RCH-ACV-WT-UNC8884-10.vs.RCH-ACV-WT-0.imatrix.csv
RCH-ACV-Mut-0 | RCH-ACV-WT-0 | 2,212 | 1,405 | 807 | RCH-ACV-Mut-0.vs.RCH-ACV-WT-0.isoDiff.csv | RCH-ACV-Mut-0.vs.RCH-ACV-WT-0.imatrix.csv
RCH-ACV-Mut-UNC8732-10 | RCH-ACV-WT-UNC8732-10 | 2,610 | 1,575 | 1,035 | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-WT-UNC8732-10.isoDiff.csv | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-WT-UNC8732-10.imatrix.csv
RCH-ACV-Mut-UNC8884-10 | RCH-ACV-WT-UNC8884-10 | 2,158 | 1,407 | 751 | RCH-ACV-Mut-UNC8884-10.vs.RCH-ACV-WT-UNC8884-10.isoDiff.csv | RCH-ACV-Mut-UNC8884-10.vs.RCH-ACV-WT-UNC8884-10.imatrix.csv
Table 11. Results of isoform-level differential expression analysis.

File: GE7114-NSD2deg-isodiff.xlsx
Size: 1.06 MB
Description: Excel file containing differentially expressed isoforms for all contrasts (one sheet per contrast).

File: GE7114-NSD2deg-allisodiff.xlsx
Size: 41.19 MB
Description: Excel file containing differential expression values for all isoforms in all contrasts (one sheet per contrast).

9. Differential expression - combined files
The following file contains merged differential expression data. The first sheet contains fold changes for all genes that were found to be differentially expressed in at least one contrast. The second and third sheets contain the same information for coding genes only, and all transcripts.
File: GE7114-NSD2deg-merged.allDiff.xlsx
Size: 1001.41 kB
Description: Merged fold changes for all differentially expressed genes, coding genes, and transcripts respectively.

10. Alternative splicing analysis
Alternative splicing analysis was performed using rMATS v4.1.0. The following table reports the number of events in each class for each contrast. The link in the last column allows you to download an Excel file containing full results.

Test | Control | SE | RI | MXE | A3SS | A5SS | Full
RCH-ACV-Mut-UNC8732-10 | RCH-ACV-Mut-0 | 631 | 183 | 141 | 136 | 111 | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-Mut-0.MATS.xlsx
RCH-ACV-Mut-UNC8732-10 | RCH-ACV-Mut-UNC8884-10 | 240 | 90 | 81 | 64 | 46 | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-Mut-UNC8884-10.MATS.xlsx
RCH-ACV-Mut-UNC8884-10 | RCH-ACV-Mut-0 | 237 | 72 | 80 | 76 | 47 | RCH-ACV-Mut-UNC8884-10.vs.RCH-ACV-Mut-0.MATS.xlsx
RCH-ACV-WT-UNC8732-10 | RCH-ACV-WT-0 | 280 | 123 | 86 | 75 | 56 | RCH-ACV-WT-UNC8732-10.vs.RCH-ACV-WT-0.MATS.xlsx
RCH-ACV-WT-UNC8732-10 | RCH-ACV-WT-UNC8884-10 | 576 | 169 | 110 | 133 | 92 | RCH-ACV-WT-UNC8732-10.vs.RCH-ACV-WT-UNC8884-10.MATS.xlsx
RCH-ACV-WT-UNC8884-10 | RCH-ACV-WT-0 | 582 | 169 | 141 | 152 | 88 | RCH-ACV-WT-UNC8884-10.vs.RCH-ACV-WT-0.MATS.xlsx
RCH-ACV-Mut-0 | RCH-ACV-WT-0 | 633 | 180 | 160 | 158 | 90 | RCH-ACV-Mut-0.vs.RCH-ACV-WT-0.MATS.xlsx
RCH-ACV-Mut-UNC8732-10 | RCH-ACV-WT-UNC8732-10 | 686 | 179 | 138 | 144 | 109 | RCH-ACV-Mut-UNC8732-10.vs.RCH-ACV-WT-UNC8732-10.MATS.xlsx
RCH-ACV-Mut-UNC8884-10 | RCH-ACV-WT-UNC8884-10 | 256 | 85 | 82 | 51 | 38 | RCH-ACV-Mut-UNC8884-10.vs.RCH-ACV-WT-UNC8884-10.MATS.xlsx
Table 12. Number of alternative splicing events, by class, for each contrast. Classes are: SE=exon skipping; RI=intron retention; MXE=mutually exclusive exons; A3SS=alternative 3' splice site; A5SS=alternative 5' splice site.

11. MultiQC report
MultiQC is a general Quality Control tool for a large number of bioinformatics pipelines. The report on this analysis (generated using MultiQC version 1.12) is available here:

MultiQC report
12. UCSC hub

UCSC Genome Browser: use the previous link to display the data tracks automatically, or copy the URL https://bw:bw@lichtlab.cancer.ufl.edu/reports/NSD//GE7114-NSD2deg/hub/hub.txt and paste it into the "My Hubs" form in this page.

WashU EpiGenome Browser: use the previous link to display the data tracks automatically, or copy the following URL into the "Datahub by URL Link" field: https://bw:bw@lichtlab.cancer.ufl.edu/reports/NSD//GE7114-NSD2deg/hub/hub.json.

13. Methods summary

Short reads were trimmed using trimmomatic (v 0.36) [1], and QC on the original and trimmed reads was performed using FastQC (v 0.11.4) [2] and MultiQC [3].

The reads were aligned to the transcriptome using STAR version 2.7.9a [4].

Transcript abundance was quantified using RSEM (v1.3.1) [5].

Differential expression analysis was performed using DESeq2 [6], with an FDR-corrected P-value threshold of 0.05. The output files were further filtered to extract transcripts showing at least a 2.0-fold change in either direction, i.e. abs(log2(FC)) >= 1.0. Results were reported for protein-coding genes only, and for all transcript types.

Alternative splicing analysis was performed using rMATS v4.1.0 [7].


References

  1. Bolger, A. M., Lohse, M., and Usadel, B. (2014). Trimmomatic: A flexible trimmer for Illumina Sequence Data. Bioinformatics, btu170.
  2. https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
  3. Philip Ewels, Mans Magnusson, Sverker Lundin and Max Kaller (2016). MultiQC: Summarize analysis results for multiple tools and samples in a single report. Bioinformatics | doi: 10.1093/bioinformatics/btw354 | PubMed: 27312411
  4. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 29(1):15-21 | doi: 10.1093/bioinformatics/bts635
  5. Li B and Dewey CN (2011). RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12:323 | doi: 10.1186/1471-2105-12-323
  6. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology, 15,550 (2014). | doi: 10.1186/s13059-014-0550-8
  7. Shen S., Park JW., Lu ZX., Lin L., Henry MD., Wu YN., Zhou Q., Xing Y. rMATS: Robust and Flexible Detection of Differential Alternative Splicing from Replicate RNA-Seq Data. PNAS, 111(51):E5593-601 | doi: 10.1073/pnas.1419161111



Completed: 12-11-2023@11:54
© 2023, A. Riva, University of Florida.