Sequana: a set of NGS pipelines
This page provides additional resources for
the Sequana project. Sequana is built around a Python library, whose
documentation is available on Read The Docs.
The source code for the Python library can be found on
GitHub.
One of the primary objectives of the Sequana project is to offer a collection of NGS pipelines.
The pipelines currently available are published on PyPI and on the Sequana GitHub organisation page,
each in its own repository. For instance, the RNA-seq pipeline can be found at sequana/rnaseq.
This page is also a place for providing supplementary information to the main repositories.
Pipelines
Sequana provides a collection of NGS pipelines. Each pipeline lives in its own repository on the
Sequana GitHub organisation. Below are some highlighted pipelines:
sequana_fastqc
A quality-control pipeline that runs FastQC and MultiQC on raw FASTQ files to assess sequencing quality.
sequana_rnaseq
An RNA-seq pipeline that covers trimming, alignment, feature counting, and differential expression analysis.
sequana_variant_calling
A variant-calling pipeline that performs read trimming, alignment, and SNP/indel detection
using standard tools such as BWA and freebayes, with interactive HTML reports.
sequana_demultiplex
A demultiplexing pipeline built around bcl2fastq / bcl-convert that converts Illumina
BCL files into FASTQ files, validates sample sheets, and generates MultiQC summary reports.
Published Tools
The following tools have been published in peer-reviewed journals. Each tool is also available
as a standalone application within the Sequana Python library.
Unpublished Tools
In addition to its NGS pipelines, Sequana ships a set of standalone sub-commands that can be
invoked directly from the command line (sequana <subcommand>). These tools
cover common NGS utility tasks and are distributed as part of the
sequana Python package.
sequana coverage
Analyse the depth of coverage of a BAM/BED file across an entire genome or targeted
regions, detect abnormal coverage regions, and generate an interactive HTML report.
sequana taxonomy
Perform taxonomic classification of sequencing reads using Kraken2 or other back-ends
and produce summary plots and reports for metagenomics studies.
sequana summary
Compute and display summary statistics (read counts, GC content, quality metrics) for
one or more FASTQ files without running a full pipeline.
sequana enrichment
Perform Gene Ontology (GO) and KEGG pathway enrichment analysis on a list of
differentially expressed genes, with interactive visualisations.
sequana samplesheet
Validate and convert Illumina sample-sheet files, checking for common formatting errors
before demultiplexing with bcl2fastq or similar tools.
sequana gtf_fixer
Fix common inconsistencies in GTF/GFF annotation files (missing gene-level entries,
non-standard attributes) so they are compatible with downstream RNA-seq tools.
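To make the kind of statistics listed under sequana summary more concrete, here is a minimal Python sketch that computes the read count, GC fraction, and mean base quality of a FASTQ stream. It is an illustration only and does not use or reproduce the Sequana code: the function name fastq_summary is hypothetical, and the sketch assumes well-formed four-line FASTQ records with Phred+33 quality encoding.

```python
import io


def fastq_summary(handle):
    """Summarise a FASTQ stream (hypothetical helper, not part of Sequana).

    Assumes four lines per record and Phred+33 quality encoding.
    """
    n_reads = 0
    n_bases = 0
    n_gc = 0
    qual_sum = 0
    while True:
        header = handle.readline()
        if not header:  # end of file
            break
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().strip()
        n_reads += 1
        n_bases += len(seq)
        n_gc += sum(seq.upper().count(base) for base in "GC")
        qual_sum += sum(ord(char) - 33 for char in qual)
    return {
        "reads": n_reads,
        "bases": n_bases,
        "gc_fraction": n_gc / n_bases if n_bases else 0.0,
        "mean_quality": qual_sum / n_bases if n_bases else 0.0,
    }


# Two toy reads: one at quality 40 ('I'), one at quality 0 ('!').
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\n!!!!\n")
stats = fastq_summary(example)
print(stats)
```

For real data the same function can be given an open file handle, for example one returned by gzip.open(path, "rt") for compressed FASTQ files.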
Talks
Applications
Check your Illumina Sample Sheet
Report examples (click on the image)
The following was generated with the Sequana Coverage standalone application and is kept here because it is referenced in other places. It shows the coverage along a virus genome, where the black dotted line represents the coverage, the blue line is the median coverage, and the red lines are the top and bottom thresholds for detecting events of interest.
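The idea of comparing coverage with a median and with upper and lower thresholds can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the method implemented in Sequana Coverage: the function names, the window size, and the fixed ratio thresholds (0.5 and 1.5 times the running median) are all assumptions chosen for the example.

```python
from statistics import median


def running_median(values, window):
    """Median of a centred window around each position (window shrinks at the edges)."""
    half = window // 2
    result = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        result.append(median(values[lo:hi]))
    return result


def flag_events(coverage, window=5, low=0.5, high=1.5):
    """Return positions whose coverage falls outside [low, high] times the running median.

    The ratio thresholds are illustrative assumptions, not Sequana defaults.
    """
    medians = running_median(coverage, window)
    events = []
    for pos, (cov, med) in enumerate(zip(coverage, medians)):
        if med > 0 and (cov < low * med or cov > high * med):
            events.append(pos)
    return events


# Toy coverage track with one spike (position 4) and one drop (position 8).
coverage = [100, 102, 98, 101, 300, 99, 100, 97, 10, 101, 100]
events = flag_events(coverage)
print(events)  # [4, 8]
```

A running median is used here rather than a running mean because isolated spikes or drops barely move it, so the outliers stand out against the local baseline.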