Logo Alithea

Library preparation and sequencing data analysis report


Project name (ID): 251126_CRL_HepaRG_redo_new_genome


Overview

Library preparation summary

Library name Number of samples
L030625CR01_01 384
L030625CR02_01 384

Library sequencing summary



Library         PF_reads   Demultiplexed reads  % Demultiplexed  Avg. nb. reads/sample  q30 R1    q30 R2
L030625CR01_01  457293782  440090909            96.24            1,146,070              0.966112  0.95123
L030625CR02_01  385514061  372020558            96.5             968,803.5              0.963419  0.936913

Column definitions
Library: FASTQ name of the BRB-seq library
PF_reads: the total number of reads
Demultiplexed reads: the number of demultiplexed reads
% Demultiplexed: number of demultiplexed reads / total reads
Avg. nb. reads/sample: number of demultiplexed reads / number of used samples
q30 R1: the rate (0..1) of bases of read 1 (BC+UMI) with a quality score of 30 or higher
q30 R2: the rate (0..1) of bases of read 2 (genomic) with a quality score of 30 or higher
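The two derived columns can be recomputed from the raw counts as a quick sanity check. The sketch below uses only numbers from the tables above; the function and variable names are illustrative and not part of the report.

```python
# Values copied from the library preparation and sequencing summary tables.
N_SAMPLES = 384  # samples per library

libraries = {
    "L030625CR01_01": {"pf_reads": 457_293_782, "demultiplexed": 440_090_909},
    "L030625CR02_01": {"pf_reads": 385_514_061, "demultiplexed": 372_020_558},
}


def sequencing_summary(pf_reads, demultiplexed, n_samples):
    """Return (% demultiplexed, average demultiplexed reads per sample)."""
    pct_demultiplexed = 100 * demultiplexed / pf_reads
    avg_reads_per_sample = demultiplexed / n_samples
    return pct_demultiplexed, avg_reads_per_sample


summary = {
    name: sequencing_summary(v["pf_reads"], v["demultiplexed"], N_SAMPLES)
    for name, v in libraries.items()
}

for name, (pct, avg) in summary.items():
    print(f"{name}: {pct:.2f}% demultiplexed, {avg:,.1f} reads/sample")
```

Both libraries reproduce the tabulated percentages (96.24 and 96.5) when rounded to two decimals.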

Library alignment summary



Library         Genome assembly          Nb. Mapped   % Mapped  Nb. Quantified  % Exons  Nb. genes  Nb. ERCC   % ERCC
L030625CR01_01  homo_sapiens GRCh38 114  336,302,780  76.42     277,802,301     63.12    12947.28   3,348,977  1.52
L030625CR02_01  homo_sapiens GRCh38 114  291,492,558  78.35     251,288,513     67.55    13206.85   5,019,679  2.56

Column definitions
Library: FASTQ name of the BRB-seq library
Genome assembly: reference genome used for alignment
Nb. Mapped: total number of reads mapped against the genome
% Mapped: % of the reads mapped against the genome
Nb. Quantified: total number of quantified reads
% Exons: total % of quantified counts
Nb. genes: average number of detected genes across used samples
Nb. ERCC: total number of reads mapped to ERCC spike-ins
% ERCC: total % of reads mapped to ERCC spike-ins
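The report does not state which denominator the alignment percentages use. The tabulated % Mapped and % Exons values are reproduced when the mapped and quantified read counts are divided by the demultiplexed reads from the sequencing summary, so the sketch below assumes that denominator. The % ERCC column is left out because its denominator cannot be identified from the totals shown. All names are illustrative.

```python
# Demultiplexed reads come from the sequencing summary table;
# mapped and quantified reads come from the alignment summary table.
counts = {
    "L030625CR01_01": {
        "demultiplexed": 440_090_909,
        "mapped": 336_302_780,
        "quantified": 277_802_301,
    },
    "L030625CR02_01": {
        "demultiplexed": 372_020_558,
        "mapped": 291_492_558,
        "quantified": 251_288_513,
    },
}


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to 2 decimals."""
    return round(100 * part / whole, 2)


# Assumption: both percentages are relative to demultiplexed reads.
rates = {
    name: {
        "pct_mapped": percent(c["mapped"], c["demultiplexed"]),
        "pct_exons": percent(c["quantified"], c["demultiplexed"]),
    }
    for name, c in counts.items()
}

for name, r in rates.items():
    print(f"{name}: {r['pct_mapped']}% mapped, {r['pct_exons']}% exons")
```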

L030625CR01_01

Number of sequencing reads, per sample - L030625CR01_01
Per-sample identification statistics, shown as a bar plot and a plate view. Samples in the bar plot can be ordered by default, row-wise, or column-wise.

Number of RNA spike-in reads, per sample - L030625CR01_01
Per-sample ERCC (RNA spike-in) statistics, shown as a bar plot and a plate view. Samples in the bar plot can be ordered by default, row-wise, or column-wise.

Spike-in concentration
Theoretical ERCC concentrations from the Thermo Fisher Scientific Mix 1 spike-in list, plotted alongside the observed ERCC expression levels to compare expected and actual RNA spike-in performance.

Spike-in identification
Number of reads per sample plotted against the percentage of identified ERCC, with both theoretical and observed traces. The theoretical ERCC percentage is the mean ERCC count across samples divided by each sample's sequencing depth, multiplied by 100.

Sample identification compared to spike-ins
Variability of total reads per sample compared with variability of identified spike-ins. Lower ERCC variability suggests minimal technical noise.

Spike-in clusters
PCA of samples based on normalized ERCC expression values. Samples that cluster closely have similar ERCC expression, which suggests consistent technical performance across those samples.
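The theoretical ERCC percentage described for the spike-in identification plot can be sketched as follows. The per-sample numbers are made up for illustration and are not taken from this run; the function names are likewise hypothetical.

```python
def theoretical_ercc_percent(ercc_counts, depths):
    """Mean ERCC count across samples, divided by each sample's depth, times 100."""
    mean_ercc = sum(ercc_counts) / len(ercc_counts)
    return [100 * mean_ercc / depth for depth in depths]


def observed_ercc_percent(ercc_counts, depths):
    """Each sample's own ERCC count as a percentage of its depth."""
    return [100 * ercc / depth for ercc, depth in zip(ercc_counts, depths)]


# Hypothetical three-sample example (not data from the report).
ercc_counts = [8_000, 10_000, 12_000]
depths = [500_000, 1_000_000, 2_000_000]

theoretical = theoretical_ercc_percent(ercc_counts, depths)
observed = observed_ercc_percent(ercc_counts, depths)

for depth, theo, obs in zip(depths, theoretical, observed):
    print(f"depth={depth:>9,}  theoretical={theo:.2f}%  observed={obs:.2f}%")
```

Because the numerator is constant, the theoretical trace falls as sequencing depth rises. Samples whose observed value sits far from that trace received more or fewer spike-in reads than the average sample.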
Alignment statistics, per sample - L030625CR01_01
Per-sample total alignment statistics from STARsolo, shown as a bar plot and a plate view. Samples in the bar plot can be ordered by default, row-wise, or column-wise.

Number of detected genes, per sample - L030625CR01_01
Per-sample gene detection statistics, shown as a bar plot and a plate view. Samples in the bar plot can be ordered by default, row-wise, or column-wise.

Compare sample-wise statistics - L030625CR01_01
Scatter plot for comparing per-sample statistics such as number of reads, number of genes, alignment percentage, and ERCC counts. Both axes can be set from drop-down menus to explore relationships between these metrics.

Principal Component Analysis - L030625CR01_01
PCA of the overall expression profiles of all samples, with dots sized by read count. Samples with similar expression profiles group together, which helps identify patterns and clusters in gene expression.

Top 10 most expressed genes across samples - L030625CR01_01
Pie chart of the 10 most highly expressed genes. Each slice is that gene's percentage of total gene expression.
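As a rough illustration of how such shares can be derived from a count table, the sketch below ranks genes by summed counts and expresses each as a percentage of the total. It assumes the percentage is taken over all genes rather than only the top 10; the report does not say which. The gene counts are invented for the example.

```python
def top_expressed(gene_counts, n=10):
    """Return the n highest-count genes with their share of total counts (%)."""
    total = sum(gene_counts.values())
    ranked = sorted(gene_counts.items(), key=lambda item: item[1], reverse=True)
    return [(gene, 100 * count / total) for gene, count in ranked[:n]]


# Hypothetical counts summed across samples (not data from the report).
gene_counts = {"ALB": 500, "APOA1": 300, "FTL": 150, "ACTB": 50}

top2 = top_expressed(gene_counts, n=2)
for gene, share in top2:
    print(f"{gene}: {share:.1f}% of total expression")
```

The same function applies to biotypes if the dictionary keys are biotype names instead of gene names.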

Top 10 most expressed biotypes across samples - L030625CR01_01
Pie chart of the 10 most highly expressed gene biotypes. Each slice is that biotype's percentage of total gene expression.

L030625CR02_01

Number of sequencing reads, per sample - L030625CR02_01
Per-sample identification statistics, shown as a bar plot and a plate view. Samples in the bar plot can be ordered by default, row-wise, or column-wise.

Number of RNA spike-in reads, per sample - L030625CR02_01
Per-sample ERCC (RNA spike-in) statistics, shown as a bar plot and a plate view. Samples in the bar plot can be ordered by default, row-wise, or column-wise.

Spike-in concentration
Theoretical ERCC concentrations from the Thermo Fisher Scientific Mix 1 spike-in list, plotted alongside the observed ERCC expression levels to compare expected and actual RNA spike-in performance.

Spike-in identification
Number of reads per sample plotted against the percentage of identified ERCC, with both theoretical and observed traces. The theoretical ERCC percentage is the mean ERCC count across samples divided by each sample's sequencing depth, multiplied by 100.

Sample identification compared to spike-ins
Variability of total reads per sample compared with variability of identified spike-ins. Lower ERCC variability suggests minimal technical noise.

Spike-in clusters
PCA of samples based on normalized ERCC expression values. Samples that cluster closely have similar ERCC expression, which suggests consistent technical performance across those samples.
Alignment statistics, per sample - L030625CR02_01
Per-sample total alignment statistics from STARsolo, shown as a bar plot and a plate view. Samples in the bar plot can be ordered by default, row-wise, or column-wise.

Number of detected genes, per sample - L030625CR02_01
Per-sample gene detection statistics, shown as a bar plot and a plate view. Samples in the bar plot can be ordered by default, row-wise, or column-wise.

Compare sample-wise statistics - L030625CR02_01
Scatter plot for comparing per-sample statistics such as number of reads, number of genes, alignment percentage, and ERCC counts. Both axes can be set from drop-down menus to explore relationships between these metrics.

Principal Component Analysis - L030625CR02_01
PCA of the overall expression profiles of all samples, with dots sized by read count. Samples with similar expression profiles group together, which helps identify patterns and clusters in gene expression.

Top 10 most expressed genes across samples - L030625CR02_01
Pie chart of the 10 most highly expressed genes. Each slice is that gene's percentage of total gene expression.

Top 10 most expressed biotypes across samples - L030625CR02_01
Pie chart of the 10 most highly expressed gene biotypes. Each slice is that biotype's percentage of total gene expression.

UMI statistics

Advanced QC: UMI duplication statistics

Library         Nb UMI counts  % UMI counts
L030625CR01_01  198181530      71.34
L030625CR02_01  149742272      59.59

Column definitions
Library: FASTQ name of the BRB-seq library
Nb UMI counts: number of UMI counts (i.e. demultiplexed and mapped to the exons)
% UMI counts: overall % of the reads coming from unique molecules
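The denominator behind % UMI counts is not spelled out. The tabulated values are reproduced when the UMI counts are divided by the total number of quantified reads from the alignment summary, so the sketch below assumes that relationship. The duplication rate shown alongside is simply the complement and is not a figure the report provides.

```python
# UMI counts from the table above; quantified reads from the alignment summary.
umi_inputs = {
    "L030625CR01_01": {"umi_counts": 198_181_530, "quantified_reads": 277_802_301},
    "L030625CR02_01": {"umi_counts": 149_742_272, "quantified_reads": 251_288_513},
}


def umi_stats(umi_counts, quantified_reads):
    """Return (% reads from unique molecules, % duplicated reads)."""
    pct_unique = 100 * umi_counts / quantified_reads
    return round(pct_unique, 2), round(100 - pct_unique, 2)


# Assumption: % UMI counts = UMI counts / quantified reads.
umi_summary = {
    name: umi_stats(v["umi_counts"], v["quantified_reads"])
    for name, v in umi_inputs.items()
}

for name, (unique, duplicated) in umi_summary.items():
    print(f"{name}: {unique}% unique molecules, {duplicated}% duplicates")
```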

rRNA statistics

Advanced QC: rRNA mapping statistics

Library         Nb of rRNA  % of rRNA
L030625CR01_01  122240      12.22
L030625CR02_01  79004       7.90

Column definitions
Library: FASTQ name of the BRB-seq library
Nb of rRNA: number of reads mapped to rRNA using the bbmap tool
% of rRNA: overall % of the reads coming from rRNA
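The rRNA counts are small relative to the library sizes, and the report does not state what they are divided by. The tabulated percentages are reproduced with a denominator of 1,000,000 reads, which would be consistent with rRNA mapping on a fixed-size subsample. That denominator is inferred from the numbers and should be treated as an assumption.

```python
# Assumption: rRNA mapping was run on 1,000,000 reads per library.
# This value is inferred from the table, not stated in the report.
ASSUMED_READS_CHECKED = 1_000_000

rrna_reads = {
    "L030625CR01_01": 122_240,
    "L030625CR02_01": 79_004,
}

rrna_percent = {
    name: round(100 * n_rrna / ASSUMED_READS_CHECKED, 2)
    for name, n_rrna in rrna_reads.items()
}

for name, pct in rrna_percent.items():
    print(f"{name}: {pct:.2f}% rRNA")
```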