We analyzed sample E76-KP from directory E76-KP/outs/filtered_feature_bc_matrix/. Input dataset:
data = rs_load10Xrun(params$data_dir, project=params$sample)
## Warning: Feature names cannot have underscores ('_'), replacing with dashes
## ('-')
data
## An object of class Seurat
## 19555 features across 12096 samples within 1 assay
## Active assay: RNA (19555 features, 0 variable features)
The following violin plot shows the distribution of the number of genes, counts, and percentage of mitochondrial reads in all cells.
We filtered the dataset to remove cells with fewer than 500 or more than 20000 detected genes, and cells with more than 10% mitochondrial reads. The resulting dataset contains 11831 cells (97.8% of the initial 12096).
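The filtering rule and the retained percentage can be illustrated with a minimal base R sketch. The toy table and its column names (`n_genes`, `pct_mt`) are hypothetical stand-ins for the per-cell metadata; the report itself does not show the code it used for this step.

```r
# Toy per-cell QC table; column names and values are made up for illustration.
qc <- data.frame(
  cell    = c("c1", "c2", "c3", "c4", "c5"),
  n_genes = c(350, 2500, 21000, 4800, 6000),
  pct_mt  = c(2.1, 4.5, 3.0, 12.7, 9.9)
)

# Keep cells with 500 to 20000 detected genes and at most 10% mitochondrial reads.
keep <- qc$n_genes >= 500 & qc$n_genes <= 20000 & qc$pct_mt <= 10
filtered <- qc[keep, ]

# Percentage of toy cells retained.
toy_retained_pct <- round(100 * nrow(filtered) / nrow(qc), 1)

# The same arithmetic applied to the cell counts quoted in the report.
report_retained_pct <- round(100 * 11831 / 12096, 1)
print(report_retained_pct)
```

On a Seurat object the equivalent step is usually a `subset()` call on the per-cell metadata columns, but which columns this pipeline used is an assumption.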
The following plots display the relationship between counts and the percentage of mitochondrial reads, and between counts and the number of genes.
## Warning: CombinePlots is being deprecated. Plots should now be combined using
## the patchwork system.
After normalization, we identified the 12 most variable genes. They are:
## [1] "TP53I3" "TAGLN" "MDM2" "DHRS2" "CCNB1" "AURKA" "NR0B1" "LCE1C"
## [9] "PVRL4" "UBE2C" "IFI27" "HMMR"
The following chart plots the standardized variance against the average expression. Outliers represent features with high variability (the top 2000 are shown in red). The 12 most variable features are labeled.
## When using repel, set xnudge and ynudge to 0 for optimal results
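To make "standardized variance" concrete, here is a simplified base R sketch of a mean-variance standardization on simulated counts. It assumes the report relied on a variance-stabilizing selection of the kind commonly used in Seurat workflows; the report does not state the method, and the simulated data, the `span` value, and all variable names are illustrative only.

```r
set.seed(1)
n_genes <- 200
n_cells <- 100

# Simulated counts: Poisson noise around a per-gene mean (rows = genes, columns = cells).
gene_mu <- runif(n_genes, min = 0.5, max = 20)
counts <- matrix(
  rpois(n_genes * n_cells, lambda = rep(gene_mu, times = n_cells)),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), NULL)
)
# Make one gene overdispersed so that it departs from the mean-variance trend.
counts["gene1", ] <- rnbinom(n_cells, mu = 5, size = 0.2)

gene_mean <- rowMeans(counts)
gene_var <- apply(counts, 1, var)

# Model the expected variance as a smooth function of the mean on a log10 scale.
fit <- loess(log10(gene_var) ~ log10(gene_mean), span = 0.3)
expected_sd <- sqrt(10^fitted(fit))

# Standardize each gene by its expected SD, clip large values, then take the variance.
z <- (counts - gene_mean) / expected_sd
z[z > sqrt(n_cells)] <- sqrt(n_cells)
standardized_var <- apply(z, 1, var)

# Rank genes by standardized variance and keep the 12 highest.
top12 <- names(sort(standardized_var, decreasing = TRUE))[1:12]
print(top12)
```

Genes that follow the fitted trend end up with a standardized variance near 1, so genes well above 1 are the outliers highlighted in the plot.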
The following plot displays the first two dimensions of the Principal Component Analysis of this dataset.
## Centering and scaling data matrix
## PC_ 1
## Positive: RPL10, MT1X, EEF1A1, RPL7, S100A6, DDIT4, RPL7A, RPL3, RPL23, RPS4X
## MT2A, MT1E, GNB2L1, RPS3, KRT18, RPL4, DUSP23, RPS19, RPSA, KRT7
## KRT8, IGFBP2, TKT, DUT, JUNB, ZFP36L2, GADD45B, TPM2, MCM6, CDCA7
## Negative: MKI67, UBE2C, TUBA1B, TOP2A, TPX2, UBE2S, HIST1H4C, PRC1, TUBB4B, HIST1H1B
## NUSAP1, HMGB2, CDKN3, CENPF, KIF23, CENPE, HMMR, CDCA3, NDC80, PTTG1
## DYNLL1, ANLN, ASPM, CKS2, MAP1B, TUBA1C, HMGN2, HIST1H1A, DEPDC1, KIF11
## PC_ 2
## Positive: CLSPN, KIAA0101, DUT, MYBL2, HELLS, TK1, CENPM, FAM111B, TYMS, DHFR
## RRM2, E2F1, DNMT1, RNASEH2A, ATAD2, ACAT2, TMEM106C, PSMC3IP, CCNE2, CENPU
## ZWINT, CDC45, CDC6, MCM4, FAM111A, TCF19, RRM1, MCM3, RBBP8, DTL
## Negative: CCNB1, AURKA, PLK1, CDC20, HMMR, PRR11, ARL6IP1, CKS2, KIF20A, CENPA
## PTTG1, CENPE, PIF1, CCNB2, DEPDC1, KPNA2, KIF18A, DLGAP5, LGALS1, FAM83D
## CENPF, PSRC1, KIF14, CDCA3, RPL7A, HSP90B1, KNSTRN, NEK2, RPL23, LDHA
## PC_ 3
## Positive: COL8A1, KRT7, ANXA2, THBS1, OGFRL1, PPP1R14B, TMSB10, IGFBP3, EFEMP1, SERPINE1
## TMSB4X, FHL2, ENAH, CTGF, FLNA, S100A10, PAWR, IL18, TPM1, KRT17
## ACTB, ANKRD1, ALCAM, LGALS1, CAV1, ANXA3, TRAM2, TAGLN2, TAGLN, ACTN1
## Negative: FOS, GDF15, DDIT3, HMGN2, IER2, FDXR, RPS27L, HIST1H2AG, HIST1H1A, AREG
## HIST2H2AC, BTG1, BTG2, RP3-510D11.2, BAX, CD82, HIST1H2AL, PVRL4, HLA-B, UBE2T
## GCHFR, MDM2, GAMT, BTG3, HIST1H3G, HIST1H2AH, ARG2, HIST2H2BF, CDKN2C, HSPA1A
## PC_ 4
## Positive: EIF1, GAPDH, TRIB3, TMSB10, RNH1, PLP2, MYL6, MRPL54, TUBA1B, PFDN2
## SLC3A2, TNFRSF12A, HN1, KRT10, YWHAB, RPL22L1, ANXA3, RAB32, H2AFZ, DYNLL1
## UBE2S, RNASEH2A, H3F3B, HMGN2, UBB, HIST1H4C, NIFK, CDKN3, HMGB2, YWHAH
## Negative: PVRL4, MDM2, TP53I3, CDKN1A, BTG2, SULF2, CMBL, FDXR, DRAXIN, UNC5B-AS1
## NEAT1, KIAA1324, SLC52A1, ZMAT3, PHLDA3, TCEA3, RP3-510D11.2, CYSRT1, HES2, APOBEC3C
## CD82, INPP5D, GLS2, MAST4, PIDD1, ITIH5, CES2, WNT4, SERPINB5, CYFIP2
## PC_ 5
## Positive: MALAT1, MT-ND6, MT-ND2, HNRNPU, NCL, NEAT1, TAF15, EGR1, AP000769.1, SYNE2
## PHF3, RIF1, RAD21, MT-ND4, PEG10, HNRNPA3, RPL23, FOS, SPTBN1, TGFBR2
## PIK3R1, ATRX, BRCA2, BCLAF1, MACF1, LINC00657, MSH6, CLTC, PCM1, CHD4
## Negative: CLIC1, PHPT1, TNFRSF12A, RHOC, IGFBP7, PVRL4, GAPDH, TP53I3, RPS19, FHL2
## S100A11, S100A10, RAB32, GADD45A, KRT17, PFDN2, FADS3, TUBB4B, FDXR, ZFAS1
## TAGLN2, RPS3, RPS27L, EIF1, CD151, HRAS, CSRP1, RNH1, S100A16, RHOD
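The "Positive" and "Negative" gene lists above are the genes with the largest positive and negative loadings on each component. A minimal sketch of how such lists can be obtained with base R's `prcomp()` on a toy matrix follows; the data and names are made up, and the report's own PCA was presumably run through Seurat rather than `prcomp()`.

```r
set.seed(42)

# Toy expression matrix: 50 cells (rows) by 20 genes (columns).
expr <- matrix(
  rnorm(50 * 20),
  nrow = 50,
  dimnames = list(paste0("cell", 1:50), paste0("gene", 1:20))
)

# Center and scale each gene, then compute principal components.
pca <- prcomp(expr, center = TRUE, scale. = TRUE)

# Gene loadings on the first component.
pc1 <- pca$rotation[, "PC1"]

# Five genes with the highest and the lowest loadings on PC1.
top_positive <- names(sort(pc1, decreasing = TRUE))[1:5]
top_negative <- names(sort(pc1, decreasing = FALSE))[1:5]

# First two PC coordinates per cell, as used for a PC1 vs PC2 scatter plot.
embedding <- pca$x[, 1:2]
```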
The following plot displays the clustering results using the UMAP method, computed with 10 dimensions and a resolution of 0.5.
## Computing nearest neighbor graph
## Computing SNN
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
##
## Number of nodes: 11831
## Number of edges: 370888
##
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.8355
## Number of communities: 8
## Elapsed time: 4 seconds
##
##
## Table: Number of cells in each cluster
##
## |Var1 | Freq|
## |:----|----:|
## |0 | 2674|
## |1 | 2193|
## |2 | 1956|
## |3 | 1768|
## |4 | 1413|
## |5 | 1283|
## |6 | 301|
## |7 | 243|
## 21:41:13 UMAP embedding parameters a = 0.9922 b = 1.112
## 21:41:13 Read 11831 rows and found 10 numeric columns
## 21:41:13 Using Annoy for neighbor search, n_neighbors = 30
## 21:41:13 Building Annoy index with metric = cosine, n_trees = 50
## 0% 10 20 30 40 50 60 70 80 90 100%
## [----|----|----|----|----|----|----|----|----|----|
## **************************************************|
## 21:41:14 Writing NN index file to temp file /scratch/local/46218579/Rtmpi9Ds84/file5932cf59c09
## 21:41:14 Searching Annoy index using 1 thread, search_k = 3000
## 21:41:20 Annoy recall = 100%
## 21:41:20 Commencing smooth kNN distance calibration using 1 thread
## 21:41:21 Initializing from normalized Laplacian + noise
## 21:41:21 Commencing optimization for 200 epochs, with 469964 positive edges
## 21:41:27 Optimization finished
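As a quick consistency check, the cluster sizes from the table of cells per cluster should add up to the 11831 cells that passed filtering. The sketch below copies the counts from that table and also derives each cluster's share; the variable names are illustrative.

```r
# Cluster sizes copied from the "Number of cells in each cluster" table.
cluster_sizes <- c(
  "0" = 2674, "1" = 2193, "2" = 1956, "3" = 1768,
  "4" = 1413, "5" = 1283, "6" = 301, "7" = 243
)

# The total should equal the number of cells kept after filtering.
total_cells <- sum(cluster_sizes)

# Share of cells in each cluster, in percent.
cluster_pct <- round(100 * cluster_sizes / total_cells, 1)
print(total_cells)
print(cluster_pct)
```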
The following plot displays the clustering results using the t-SNE method:
These results were saved to E76-KP.rds. This file can be imported into R using the readRDS() function.
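A minimal sketch of the `saveRDS()` / `readRDS()` round trip is shown below with a small stand-in object written to a temporary file. The toy list is hypothetical; the actual file presumably holds the Seurat object, in which case the Seurat package should be loaded before working with the restored object.

```r
# Stand-in object; the real file is expected to contain the analysis object.
toy_result <- list(
  sample   = "toy-sample",
  clusters = factor(c(0, 1, 1, 2))
)

# Write to a temporary .rds file and read it back.
rds_path <- tempfile(fileext = ".rds")
saveRDS(toy_result, rds_path)
restored <- readRDS(rds_path)

# The restored object is identical to the one that was saved.
round_trip_ok <- identical(toy_result, restored)
```

For the report's own output, the call would be `readRDS("E76-KP.rds")` with the working directory set to wherever the file was written.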
## # A tibble: 16 × 7
## # Groups: cluster [8]
## p_val avg_log2FC pct.1 pct.2 p_val_adj cluster gene
## <dbl> <dbl> <dbl> <dbl> <dbl> <fct> <chr>
## 1 0 0.781 0.997 0.952 0 0 DUT
## 2 0 0.733 0.997 0.931 0 0 TK1
## 3 0 1.70 1 0.971 0 1 HIST1H4C
## 4 0 1.51 0.999 0.905 0 1 H1F0
## 5 0 1.18 0.979 0.889 0 2 MT1X
## 6 0 1.05 0.957 0.807 0 2 DDIT4
## 7 5.59e-202 0.706 0.988 0.921 1.09e-197 3 COL8A1
## 8 1.80e-153 0.647 0.761 0.499 3.52e-149 3 THBS1
## 9 0 2.93 0.984 0.378 0 4 CCNB1
## 10 0 2.43 0.996 0.457 0 4 HMMR
## 11 0 1.09 1 0.973 0 5 HIST1H4C
## 12 2.51e-306 0.820 0.954 0.466 4.92e-302 5 HIST1H1B
## 13 4.61e- 14 1.93 0.382 0.265 9.01e- 10 6 AP000769.1
## 14 4.06e- 9 1.58 0.296 0.591 7.95e- 5 6 SAA1
## 15 1.55e-155 2.96 0.979 0.477 3.02e-151 7 MDM2
## 16 3.73e-112 3.15 0.922 0.505 7.29e-108 7 TP53I3
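The `p_val_adj` values in the table above are numerically consistent with a Bonferroni correction over the 19555 features in the dataset, that is, `p_val` multiplied by 19555 and capped at 1. This is an inference from the printed numbers, not something the report states. The sketch below checks four rows copied from the table, allowing for the rounding of the displayed values.

```r
# Number of features reported for the dataset.
n_features <- 19555

# Raw p-values and adjusted p-values copied from the table (rounded as displayed).
p_val <- c("COL8A1" = 5.59e-202, "THBS1" = 1.80e-153,
           "AP000769.1" = 4.61e-14, "SAA1" = 4.06e-9)
reported_adj <- c("COL8A1" = 1.09e-197, "THBS1" = 3.52e-149,
                  "AP000769.1" = 9.01e-10, "SAA1" = 7.95e-5)

# Bonferroni adjustment with the total number of tests set to the feature count.
p_adj <- p.adjust(p_val, method = "bonferroni", n = n_features)

# Relative difference between recomputed and reported values.
rel_diff <- abs(p_adj - reported_adj) / reported_adj
print(signif(p_adj, 3))
```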
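Marker genes for this sample were saved to E76-KP.markers.csv. This is a tab-delimited file that can be opened with Excel. The following tables list the top 12 marker genes for each cluster. In each table the first column is the row name; when a gene already appeared as a marker in an earlier cluster, R appends a digit to keep row names unique (for example, RPL101 for RPL10), so the gene column holds the actual gene symbol.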
| | p_val | avg_log2FC | pct.1 | pct.2 | p_val_adj | cluster | gene |
---|---|---|---|---|---|---|---|
DUT | 0 | 0.7808906 | 0.997 | 0.952 | 0 | 0 | DUT |
TK1 | 0 | 0.7333225 | 0.997 | 0.931 | 0 | 0 | TK1 |
RPS3A | 0 | 0.2719043 | 1.000 | 0.999 | 0 | 0 | RPS3A |
GINS2 | 0 | 0.5280579 | 0.941 | 0.788 | 0 | 0 | GINS2 |
KIAA0101 | 0 | 0.4970896 | 0.996 | 0.945 | 0 | 0 | KIAA0101 |
E2F1 | 0 | 0.5005704 | 0.926 | 0.741 | 0 | 0 | E2F1 |
RPS4X | 0 | 0.3045369 | 0.999 | 0.997 | 0 | 0 | RPS4X |
PCNA | 0 | 0.4790178 | 0.979 | 0.897 | 0 | 0 | PCNA |
TYMS | 0 | 0.5206032 | 0.949 | 0.795 | 0 | 0 | TYMS |
IFITM3 | 0 | 0.4224553 | 0.999 | 0.990 | 0 | 0 | IFITM3 |
EEF1B2 | 0 | 0.2574588 | 1.000 | 0.996 | 0 | 0 | EEF1B2 |
RPL10 | 0 | 0.2910058 | 1.000 | 1.000 | 0 | 0 | RPL10 |
| | p_val | avg_log2FC | pct.1 | pct.2 | p_val_adj | cluster | gene |
---|---|---|---|---|---|---|---|
HIST1H4C | 0 | 1.6967600 | 1.000 | 0.971 | 0 | 1 | HIST1H4C |
H1F0 | 0 | 1.5054951 | 0.999 | 0.905 | 0 | 1 | H1F0 |
TNFRSF12A | 0 | 1.0434993 | 0.998 | 0.975 | 0 | 1 | TNFRSF12A |
UBC | 0 | 1.0281570 | 1.000 | 0.988 | 0 | 1 | UBC |
HIST1H1A | 0 | 0.9874826 | 0.912 | 0.368 | 0 | 1 | HIST1H1A |
HIST1H1B | 0 | 0.9577795 | 0.960 | 0.419 | 0 | 1 | HIST1H1B |
HIST2H2AC | 0 | 0.9471416 | 0.947 | 0.519 | 0 | 1 | HIST2H2AC |
TUBA1B | 0 | 0.9326811 | 1.000 | 0.985 | 0 | 1 | TUBA1B |
HIST1H1E | 0 | 0.9232609 | 0.995 | 0.775 | 0 | 1 | HIST1H1E |
RAB32 | 0 | 0.9157641 | 0.996 | 0.910 | 0 | 1 | RAB32 |
GDF15 | 0 | 0.8938917 | 0.990 | 0.855 | 0 | 1 | GDF15 |
H1FX | 0 | 0.8555449 | 0.990 | 0.891 | 0 | 1 | H1FX |
| | p_val | avg_log2FC | pct.1 | pct.2 | p_val_adj | cluster | gene |
---|---|---|---|---|---|---|---|
MT1X | 0 | 1.1839377 | 0.979 | 0.889 | 0 | 2 | MT1X |
DDIT4 | 0 | 1.0521316 | 0.957 | 0.807 | 0 | 2 | DDIT4 |
RPL23 | 0 | 1.0472712 | 1.000 | 0.993 | 0 | 2 | RPL23 |
RPL7 | 0 | 0.7942620 | 0.999 | 0.994 | 0 | 2 | RPL7 |
RPL7A | 0 | 0.7219394 | 1.000 | 0.998 | 0 | 2 | RPL7A |
MARCKS | 0 | 0.7038028 | 0.993 | 0.959 | 0 | 2 | MARCKS |
RPS25 | 0 | 0.6704229 | 1.000 | 0.996 | 0 | 2 | RPS25 |
RPL27A | 0 | 0.6521518 | 0.999 | 0.998 | 0 | 2 | RPL27A |
RPL34 | 0 | 0.6393907 | 0.999 | 0.993 | 0 | 2 | RPL34 |
RPS14 | 0 | 0.5520879 | 1.000 | 0.997 | 0 | 2 | RPS14 |
RPL101 | 0 | 0.5418760 | 1.000 | 1.000 | 0 | 2 | RPL10 |
EEF1A1 | 0 | 0.5091745 | 1.000 | 0.999 | 0 | 2 | EEF1A1 |
| | p_val | avg_log2FC | pct.1 | pct.2 | p_val_adj | cluster | gene |
---|---|---|---|---|---|---|---|
DUT1 | 0 | 0.5335540 | 0.998 | 0.956 | 0 | 3 | DUT |
COL8A11 | 0 | 0.7062640 | 0.988 | 0.921 | 0 | 3 | COL8A1 |
PRKDC | 0 | 0.4356001 | 0.996 | 0.970 | 0 | 3 | PRKDC |
TPM3 | 0 | 0.4058355 | 0.997 | 0.977 | 0 | 3 | TPM3 |
TK11 | 0 | 0.4665570 | 0.998 | 0.936 | 0 | 3 | TK1 |
NCL | 0 | 0.3683779 | 0.999 | 0.990 | 0 | 3 | NCL |
DHFR1 | 0 | 0.4567680 | 0.977 | 0.828 | 0 | 3 | DHFR |
MSH6 | 0 | 0.4614454 | 0.913 | 0.712 | 0 | 3 | MSH6 |
ALCAM | 0 | 0.5731826 | 0.980 | 0.909 | 0 | 3 | ALCAM |
MCM31 | 0 | 0.4436140 | 0.955 | 0.795 | 0 | 3 | MCM3 |
MCM4 | 0 | 0.4386087 | 0.932 | 0.738 | 0 | 3 | MCM4 |
OGFRL1 | 0 | 0.4382112 | 0.994 | 0.959 | 0 | 3 | OGFRL1 |
| | p_val | avg_log2FC | pct.1 | pct.2 | p_val_adj | cluster | gene |
---|---|---|---|---|---|---|---|
CCNB1 | 0 | 2.931391 | 0.984 | 0.378 | 0 | 4 | CCNB1 |
HMMR | 0 | 2.427292 | 0.996 | 0.457 | 0 | 4 | HMMR |
CKS2 | 0 | 2.340469 | 1.000 | 0.828 | 0 | 4 | CKS2 |
UBE2C | 0 | 2.294555 | 0.973 | 0.781 | 0 | 4 | UBE2C |
AURKA | 0 | 2.261870 | 0.961 | 0.209 | 0 | 4 | AURKA |
PTTG1 | 0 | 2.224650 | 0.999 | 0.719 | 0 | 4 | PTTG1 |
CENPF | 0 | 1.971110 | 0.999 | 0.799 | 0 | 4 | CENPF |
UBE2S1 | 0 | 1.959481 | 1.000 | 0.926 | 0 | 4 | UBE2S |
TOP2A | 0 | 1.945135 | 0.990 | 0.792 | 0 | 4 | TOP2A |
CDC20 | 0 | 1.939483 | 0.965 | 0.368 | 0 | 4 | CDC20 |
CENPE | 0 | 1.871429 | 0.977 | 0.411 | 0 | 4 | CENPE |
TPX2 | 0 | 1.858649 | 0.999 | 0.703 | 0 | 4 | TPX2 |
| | p_val | avg_log2FC | pct.1 | pct.2 | p_val_adj | cluster | gene |
---|---|---|---|---|---|---|---|
HIST1H4C1 | 0 | 1.0890944 | 1.000 | 0.973 | 0 | 5 | HIST1H4C |
HIST1H1B1 | 0 | 0.8197855 | 0.954 | 0.466 | 0 | 5 | HIST1H1B |
HIST1H2AL1 | 0 | 0.5277395 | 0.672 | 0.216 | 0 | 5 | HIST1H2AL |
HIST1H2AH1 | 0 | 0.4375728 | 0.625 | 0.206 | 0 | 5 | HIST1H2AH |
HIST1H1A1 | 0 | 0.7334039 | 0.871 | 0.420 | 0 | 5 | HIST1H1A |
HIST1H2AG1 | 0 | 0.4842047 | 0.702 | 0.267 | 0 | 5 | HIST1H2AG |
HIST1H1C1 | 0 | 0.7587779 | 0.987 | 0.787 | 0 | 5 | HIST1H1C |
HIST1H3B1 | 0 | 0.4791797 | 0.760 | 0.312 | 0 | 5 | HIST1H3B |
HIST2H2AC1 | 0 | 0.6947366 | 0.931 | 0.558 | 0 | 5 | HIST2H2AC |
CLSPN1 | 0 | 0.6110546 | 0.999 | 0.898 | 0 | 5 | CLSPN |
HIST1H1E1 | 0 | 0.6600555 | 0.991 | 0.795 | 0 | 5 | HIST1H1E |
TMEM106C1 | 0 | 0.5836801 | 0.995 | 0.889 | 0 | 5 | TMEM106C |
| | p_val | avg_log2FC | pct.1 | pct.2 | p_val_adj | cluster | gene |
---|---|---|---|---|---|---|---|
MT-ND51 | 0 | 0.8182990 | 0.930 | 0.999 | 0 | 6 | MT-ND5 |
COL4A4 | 0 | 0.2771294 | 0.209 | 0.621 | 0 | 6 | COL4A4 |
NEAT11 | 0 | 1.0127409 | 0.814 | 0.959 | 0 | 6 | NEAT1 |
SLC7A11 | 0 | 0.3071473 | 0.209 | 0.585 | 0 | 6 | SLC7A11 |
CD241 | 0 | 0.2935826 | 0.213 | 0.585 | 0 | 6 | CD24 |
ARHGEF28 | 0 | 0.3095013 | 0.269 | 0.672 | 0 | 6 | ARHGEF28 |
LAMB1 | 0 | 0.2593780 | 0.243 | 0.636 | 0 | 6 | LAMB1 |
MALAT1 | 0 | 0.8292176 | 0.977 | 0.999 | 0 | 6 | MALAT1 |
ABI2 | 0 | 0.3025271 | 0.389 | 0.822 | 0 | 6 | ABI2 |
SLC25A24 | 0 | 0.2626855 | 0.306 | 0.681 | 0 | 6 | SLC25A24 |
MACF11 | 0 | 0.3125926 | 0.336 | 0.725 | 0 | 6 | MACF1 |
NSD1 | 0 | 0.2632734 | 0.382 | 0.799 | 0 | 6 | NSD1 |
| | p_val | avg_log2FC | pct.1 | pct.2 | p_val_adj | cluster | gene |
---|---|---|---|---|---|---|---|
PVRL4 | 0 | 1.5252676 | 0.794 | 0.010 | 0 | 7 | PVRL4 |
UNC5B-AS1 | 0 | 0.6440566 | 0.502 | 0.025 | 0 | 7 | UNC5B-AS1 |
RRAD | 0 | 0.5410382 | 0.284 | 0.007 | 0 | 7 | RRAD |
KIAA1324 | 0 | 0.5026825 | 0.354 | 0.005 | 0 | 7 | KIAA1324 |
DRAXIN | 0 | 0.4787210 | 0.346 | 0.001 | 0 | 7 | DRAXIN |
SLC52A1 | 0 | 0.3110880 | 0.317 | 0.003 | 0 | 7 | SLC52A1 |
INPP5D | 0 | 0.2815712 | 0.317 | 0.004 | 0 | 7 | INPP5D |
RP3-510D11.2 | 0 | 0.6390548 | 0.658 | 0.065 | 0 | 7 | RP3-510D11.2 |
HES2 | 0 | 0.4662985 | 0.465 | 0.030 | 0 | 7 | HES2 |
ITIH5 | 0 | 0.3833306 | 0.391 | 0.022 | 0 | 7 | ITIH5 |
SULF2 | 0 | 0.8225731 | 0.634 | 0.068 | 0 | 7 | SULF2 |
TCEA3 | 0 | 0.6415703 | 0.626 | 0.075 | 0 | 7 | TCEA3 |
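Because the marker file is tab-delimited despite its .csv extension, it should be read with a tab separator rather than with `read.csv()`. The sketch below writes a small excerpt of the marker values shown above (rounded) to a temporary tab-delimited file, reads it back with `read.delim()`, and selects the two genes with the highest `avg_log2FC` per cluster. The file layout of the real E76-KP.markers.csv is assumed to match the columns displayed in the tables.

```r
# Small excerpt of marker rows (values rounded from the tables above).
markers <- data.frame(
  cluster    = c(0, 0, 0, 1, 1, 1),
  gene       = c("DUT", "TK1", "GINS2", "HIST1H4C", "H1F0", "UBC"),
  avg_log2FC = c(0.781, 0.733, 0.528, 1.70, 1.51, 1.03)
)

# Write a tab-delimited file to a temporary path, mimicking the saved marker file.
markers_path <- tempfile(fileext = ".csv")
write.table(markers, markers_path, sep = "\t", quote = FALSE, row.names = FALSE)

# read.delim() uses a tab separator and a header row by default.
loaded <- read.delim(markers_path)

# For each cluster, keep the two genes with the largest average log2 fold change.
top2 <- do.call(
  rbind,
  lapply(split(loaded, loaded$cluster), function(d) {
    head(d[order(-d$avg_log2FC), ], 2)
  })
)
print(top2)
```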