Summary
Next Generation Sequencing
Debian Med bioinformatics applications usable in Next Generation Sequencing
It aims to gather packages that specialize in the processing or interpretation of
data generated with next- (and later-) generation high-throughput sequencing technologies.
Description
For a better overview of the project's availability as a Debian package, each head row has a color code according to this scheme:
If you discover a project that looks like a good candidate for Debian Med,
or if you have prepared an unofficial Debian package, please do not hesitate to
send a description of that project to the Debian Med mailing list.
|
Debian Med Next Generation Sequencing packages
Official Debian packages with high relevance
anfo
Short Read Aligner/Mapper from MPG
|
Versions of package anfo |
Release | Version | Architectures |
sid | 0.98-9 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x |
jessie | 0.98-4 | amd64,armel,armhf,i386 |
stretch | 0.98-5 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
buster | 0.98-7 | amd64,arm64,armhf,i386 |
bullseye | 0.98-8 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
|
License: DFSG free
|
Anfo is a mapper in the spirit of Soap/Maq/Bowtie, but its implementation takes
more after BLAST/BLAT. It's most useful for the alignment of sequencing reads
where the DNA sequence is somehow modified (think ancient DNA or bisulphite
treatment) and/or there is more divergence between sample and reference than
what fast mappers will handle gracefully (say the reference genome is missing
and a related species is used instead).
|
|
arden
specificity control for read alignments using an artificial reference
|
Versions of package arden |
Release | Version | Architectures |
stretch | 1.0-3 | all |
trixie | 1.0-6 | all |
bookworm | 1.0-5 | all |
sid | 1.0-6 | all |
bullseye | 1.0-5 | all |
buster | 1.0-4 | all |
jessie | 1.0-1 | amd64,armel,armhf,i386 |
|
License: DFSG free
|
ARDEN (Artificial Reference Driven Estimation of false positives in NGS
data) is a novel benchmark that estimates error rates based on real
experimental reads and an additionally generated artificial reference
genome. It allows the computation of error rates specifically for a
dataset and the construction of a ROC-curve. Thereby, it can be used to
optimize parameters for read mappers, to select read mappers for a
specific problem or also to filter alignments based on quality
estimation.
|
|
art-nextgen-simulation-tools
simulation tools to generate synthetic next-generation sequencing reads
|
Versions of package art-nextgen-simulation-tools |
Release | Version | Architectures |
bullseye | 20160605+dfsg-4 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
buster | 20160605+dfsg-3 | amd64,arm64,armhf,i386 |
stretch | 20160605+dfsg-2 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
sid | 20160605+dfsg-5 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
trixie | 20160605+dfsg-5 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 20160605+dfsg-4 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
|
License: DFSG free
|
ART is a set of simulation tools to generate synthetic next-generation
sequencing reads. ART simulates sequencing reads by mimicking the real
sequencing process with empirical error models or quality profiles
summarized from large recalibrated sequencing data. ART can also
simulate reads using a user's own read error model or quality profiles. ART
supports simulation of single-end and paired-end/mate-pair reads of three
major commercial next-generation sequencing platforms: Illumina's
Solexa, Roche's 454 and Applied Biosystems' SOLiD. ART can be used to
test or benchmark a variety of methods or tools for next-generation
sequencing data analysis, including read alignment, de novo assembly,
SNP and structure variation discovery. ART was used as a primary tool
for the simulation study of the 1000 Genomes Project. ART is
implemented in C++ with optimized algorithms and is highly efficient in
read simulation. ART outputs reads in the FASTQ format, and alignments
in the ALN format. ART can also generate alignments in the SAM
alignment or UCSC BED file format. ART can be used together with genome
variants simulators (e.g. VarSim) for evaluating variant calling tools
or methods.
|
|
artfastqgenerator
outputs artificial FASTQ files derived from a reference genome
|
Versions of package artfastqgenerator |
Release | Version | Architectures |
bullseye | 0.0.20150519-4 | all |
bookworm | 0.0.20150519-4 | all |
trixie | 0.0.20150519-5 | all |
sid | 0.0.20150519-5 | all |
stretch | 0.0.20150519-2 | all |
buster | 0.0.20150519-3 | all |
|
License: DFSG free
|
ArtificialFastqGenerator takes the reference genome (in FASTA format) as
input and outputs artificial FASTQ files in the Sanger format. It can
accept Phred base quality scores from existing FASTQ files, and use them
to simulate sequencing errors. Since the artificial FASTQs are derived
from the reference genome, the reference genome provides a gold-standard
for calling variants (Single Nucleotide Polymorphisms (SNPs) and
insertions and deletions (indels)). This enables evaluation of a Next
Generation Sequencing (NGS) analysis pipeline which aligns reads to the
reference genome and then calls the variants.
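As an illustration of the Sanger-format quality encoding mentioned above: each Phred score Q is stored as the ASCII character with code Q + 33, and corresponds to a base-call error probability of 10^(-Q/10). The following Python sketch decodes such a quality string. It is not taken from ArtificialFastqGenerator; the function names are hypothetical.

```python
def phred_scores(quality_string, offset=33):
    """Decode a Sanger-format FASTQ quality string into Phred scores."""
    return [ord(ch) - offset for ch in quality_string]


def error_probabilities(quality_string, offset=33):
    """Convert each Phred score Q into an error probability 10 ** (-Q / 10)."""
    return [10 ** (-q / 10) for q in phred_scores(quality_string, offset)]


if __name__ == "__main__":
    # 'I' encodes Q=40, '5' encodes Q=20, '!' encodes Q=0
    print(phred_scores("I5!"))
    print(error_probabilities("I5!"))
```

A simulator that reuses quality scores from existing FASTQ files could draw a substitution at each base with the corresponding probability.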
|
|
bamtools
toolkit for manipulating BAM (genome alignment) files
|
Versions of package bamtools |
Release | Version | Architectures |
stretch | 2.4.1+dfsg-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
trixie | 2.5.2+dfsg-6 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
sid | 2.5.2+dfsg-6 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
buster | 2.5.1+dfsg-3 | amd64,arm64,armhf,i386 |
bullseye | 2.5.1+dfsg-9 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
jessie | 2.3.0+dfsg-2 | amd64,armel,armhf,i386 |
bookworm | 2.5.2+dfsg-4 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
|
License: DFSG free
|
BamTools facilitates research analysis and data management using BAM
files. It copes with the enormous amount of data produced by current
sequencing technologies that is typically stored in compressed, binary
formats that are not easily handled by the text-based parsers commonly
used in bioinformatics research.
BamTools provides both a C++ API for BAM file support as well as a
command-line toolkit.
This is the bamtools command-line toolkit.
Available bamtools commands:
- convert: Converts between BAM and a number of other formats
- count: Prints number of alignments in BAM file(s)
- coverage: Prints coverage statistics from the input BAM file
- filter: Filters BAM file(s) by user-specified criteria
- header: Prints BAM header information
- index: Generates index for BAM file
- merge: Merge multiple BAM files into single file
- random: Select random alignments from existing BAM file(s), intended more as a testing tool.
- resolve: Resolves paired-end reads (marking the IsProperPair flag as needed)
- revert: Removes duplicate marks and restores original base qualities
- sort: Sorts the BAM file according to some criteria
- split: Splits a BAM file on user-specified property, creating a new BAM output file for each value found
- stats: Prints some basic statistics from input BAM file(s)
The package is enhanced by the following packages:
multiqc
|
|
bcftools
genomic variant calling and manipulation of VCF/BCF files
|
Versions of package bcftools |
Release | Version | Architectures |
buster | 1.9-1 | amd64,arm64,armhf |
sid | 1.20-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
trixie | 1.20-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 1.16-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el |
stretch | 1.3.1-1 | amd64,arm64,armel,mips64el,mipsel,ppc64el |
bullseye | 1.11-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el |
stretch-backports | 1.8-1~bpo9+1 | amd64,arm64,armel,armhf,mips64el,mipsel,ppc64el |
upstream | 1.21 |
|
License: DFSG free
|
BCFtools is a set of utilities that manipulate variant calls in the
Variant Call Format (VCF) and its binary counterpart BCF. All commands work
transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.
The package is enhanced by the following packages:
multiqc
|
|
bedtools
suite of utilities for comparing genomic features
|
Versions of package bedtools |
Release | Version | Architectures |
stretch | 2.26.0+dfsg-3 | amd64,arm64,armel,i386,mips64el,mipsel,ppc64el |
jessie | 2.21.0-1 | amd64,armhf,i386 |
sid | 2.31.1+dfsg-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
trixie | 2.31.1+dfsg-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 2.30.0+dfsg-3 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bullseye | 2.30.0+dfsg-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
buster | 2.27.1+dfsg-4 | amd64,arm64,armhf |
Debtags of package bedtools: |
field | biology, biology:bioinformatics |
interface | commandline |
role | program |
scope | suite |
use | analysing, comparing, converting, filtering |
works-with | biological-sequence |
|
License: DFSG free
|
The BEDTools utilities allow one to address common genomics tasks such as
finding feature overlaps and computing coverage. The utilities are largely
based on four widely-used file formats: BED, GFF/GTF, VCF, and SAM/BAM. Using
BEDTools, one can develop sophisticated pipelines that answer complicated
research questions by streaming several BEDTools together.
The groupBy utility is distributed in the filo package.
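Feature overlap and coverage, the two tasks named above, can be sketched for BED-style intervals (0-based start, exclusive end) as follows. This is a minimal Python illustration and not the BEDTools implementation; the Interval type and the function names are assumptions made for the example.

```python
from collections import namedtuple

# BED-style interval: 0-based start, end is exclusive
Interval = namedtuple("Interval", ["chrom", "start", "end"])


def overlap_length(a, b):
    """Number of bases shared by two intervals (0 if on different chromosomes)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def covered_bases(feature, reads):
    """Count the bases of `feature` covered by at least one interval in `reads`."""
    # Clip every overlapping read to the feature, then sweep left to right.
    clipped = sorted(
        (max(r.start, feature.start), min(r.end, feature.end))
        for r in reads
        if overlap_length(feature, r) > 0
    )
    total, cursor = 0, feature.start
    for start, end in clipped:
        start = max(start, cursor)  # skip bases that were already counted
        if end > start:
            total += end - start
            cursor = end
    return total


if __name__ == "__main__":
    gene = Interval("chr1", 100, 200)
    reads = [
        Interval("chr1", 90, 120),
        Interval("chr1", 110, 150),
        Interval("chr1", 180, 250),
        Interval("chr2", 100, 200),
    ]
    print(covered_bases(gene, reads), "of", gene.end - gene.start, "bases covered")
```

Because the end coordinate is exclusive, two intervals that merely touch (one ends where the other starts) share no bases.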
|
|
berkeley-express
Streaming quantification for high-throughput sequencing
|
Versions of package berkeley-express |
Release | Version | Architectures |
trixie | 1.5.3+dfsg-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
stretch | 1.5.1-3 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
buster | 1.5.2+dfsg-1 | amd64,arm64,armhf,i386 |
bullseye | 1.5.3+dfsg-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bookworm | 1.5.3+dfsg-3 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
sid | 1.5.3+dfsg-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
|
License: DFSG free
|
eXpress is a streaming tool for quantifying the abundances of a set of
target sequences from sampled subsequences. Example applications include
transcript-level RNA-Seq quantification, allele-specific/haplotype
expression analysis (from RNA-Seq), transcription factor binding
quantification in ChIP-Seq, and analysis of metagenomic data. It is
based on an online-EM algorithm that results in space (memory)
requirements proportional to the total size of the target sequences and
time requirements that are proportional to the number of sampled
fragments. Thus, in applications such as RNA-Seq, eXpress can accurately
quantify much larger samples than other currently available tools,
greatly reducing computing infrastructure requirements. eXpress can be
used to build lightweight high-throughput sequencing processing
pipelines when coupled with a streaming aligner (such as Bowtie), as
output can be piped directly into eXpress, effectively eliminating the
need to store read alignments in memory or on disk.
In an analysis of the performance of eXpress for RNA-Seq data, it was
observed that this efficiency does not come at a cost of accuracy.
eXpress is more accurate than other available tools, even when limited
to smaller datasets that do not require such efficiency. Moreover, like
the Cufflinks program, eXpress can be used to estimate transcript
abundances in multi-isoform genes. eXpress is also able to resolve
multi-mappings of reads across gene families, and does not require a
reference genome so that it can be used in conjunction with de novo
assemblers such as Trinity, Oases, or Trans-ABySS. The underlying model
is based on previously described probabilistic models developed for
RNA-Seq but is applicable to other settings where target sequences are
sampled, and includes parameters for fragment length distributions,
errors in reads, and sequence-specific fragment bias.
eXpress can be used to resolve ambiguous mappings in other
high-throughput sequencing based applications. The only required inputs
to eXpress are a set of target sequences and a set of sequenced
fragments multiply-aligned to them. While these target sequences will
often be gene isoforms, they need not be. Haplotypes can be used as the
reference for allele-specific expression analysis, binding regions for
ChIP-Seq, or target genomes in metagenomics experiments. eXpress is
useful in any analysis where reads multi-map to sequences that differ in
abundance.
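To make the idea of resolving multi-mapped reads concrete, the following Python sketch estimates relative target abundances with expectation-maximization. It is a deliberately simplified illustration: it uses a plain batch EM rather than the online-EM of eXpress, and it ignores target lengths, fragment length distributions, read errors and sequence-specific bias. All names are hypothetical.

```python
def em_abundances(read_hits, targets, iterations=50):
    """Estimate relative abundances from reads that may map to several targets.

    read_hits: one list of candidate target names per read.
    targets:   all target names.
    """
    # Start from a uniform abundance estimate.
    theta = {t: 1.0 / len(targets) for t in targets}
    for _ in range(iterations):
        counts = {t: 0.0 for t in targets}
        # E-step: split each read among its candidates in proportion to theta.
        for hits in read_hits:
            norm = sum(theta[t] for t in hits)
            for t in hits:
                counts[t] += theta[t] / norm
        # M-step: the new abundances are the normalized fractional counts.
        total = sum(counts.values())
        theta = {t: counts[t] / total for t in targets}
    return theta


if __name__ == "__main__":
    # Two reads map only to A, one only to B, and one to both.
    hits = [["A"], ["A"], ["A", "B"], ["B"]]
    print(em_abundances(hits, ["A", "B"]))
```

In this toy input the ambiguous read is assigned mostly to A, because the uniquely mapped reads already suggest that A is more abundant than B.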
|
|
bio-rainbow
clustering and assembling short reads for bioinformatics
|
Versions of package bio-rainbow |
Release | Version | Architectures |
buster | 2.0.4+dfsg-1 | amd64,arm64,armhf,i386 |
stretch | 2.0.4-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
bullseye | 2.0.4+dfsg-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bookworm | 2.0.4+dfsg-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
trixie | 2.0.4+dfsg-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
sid | 2.0.4+dfsg-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
|
License: DFSG free
|
Efficient tool for clustering and assembling short reads,
especially for RAD.
Rainbow is developed to provide an ultra-fast and memory-efficient
solution to clustering and assembling short reads produced by RAD-seq.
First, Rainbow clusters reads using a spaced seed method. Then, Rainbow
implements a strategy similar to heterozygote calling to divide potential
groups into haplotypes in a top-down manner. Along a guided tree, it
iteratively merges sibling leaves in a bottom-up manner if they are
similar enough. Here, the similarity is defined by comparing the 2nd
reads of a RAD segment. This approach tries to collapse heterozygotes
while discriminating repetitive sequences. Finally, Rainbow uses a greedy
algorithm to locally assemble merged reads into contigs. Rainbow outputs
not only the optimal but also suboptimal assembly results. Based on
simulation and real guppy RAD-seq data, it is shown that Rainbow is
more competent than the other tools in dealing with RAD-seq data.
|
|
blasr
mapping single-molecule sequencing reads
|
Versions of package blasr |
Release | Version | Architectures |
buster | 5.3.2+dfsg-1.1 | amd64,arm64 |
sid | 5.3.5+dfsg-6 | amd64,arm64,mips64el,ppc64el,riscv64 |
stretch | 5.3+0-1 | amd64,arm64,mips64el,ppc64el |
trixie | 5.3.5+dfsg-6 | amd64,arm64,mips64el,ppc64el,riscv64 |
bookworm | 5.3.5+dfsg-6 | amd64,arm64,mips64el,ppc64el |
bullseye | 5.3.3+dfsg-5 | amd64,arm64,mips64el,ppc64el |
|
License: DFSG free
|
Basic local alignment with successive refinement (BLASR) is a method
for mapping single-molecule sequencing reads against a reference genome.
Such reads are thousands of bases long, with divergence between them
and the genome being dominated by insertion and deletion error.
|
|
bowtie
Ultrafast memory-efficient short read aligner
|
Versions of package bowtie |
Release | Version | Architectures |
bullseye | 1.3.0+dfsg1-1 | amd64,arm64,mips64el,ppc64el,s390x |
buster | 1.2.2+dfsg-4 | amd64,arm64 |
stretch | 1.1.2-6 | amd64,arm64,mips64el,ppc64el,s390x |
jessie | 1.1.1-2 | amd64 |
sid | 1.3.1-3 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
trixie | 1.3.1-3 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
bookworm | 1.3.1-1 | amd64,arm64,mips64el,ppc64el,s390x |
Debtags of package bowtie: |
biology | nuceleic-acids |
field | biology:bioinformatics |
interface | commandline |
role | program |
science | calculation |
scope | utility |
use | analysing, comparing |
works-with | biological-sequence |
|
License: DFSG free
|
This package addresses the problem of interpreting the results from the
latest (2010) DNA sequencing technologies. These yield fairly
short stretches of sequence that cannot be interpreted directly. The
challenge for tools like Bowtie is to assign a chromosomal location to the
short stretches of DNA sequenced per run.
Bowtie aligns short DNA sequences (reads) to the human genome at a rate
of over 25 million 35-bp reads per hour. Bowtie indexes the genome with
a Burrows-Wheeler index to keep its memory footprint small: typically
about 2.2 GB for the human genome (2.9 GB for paired-end).
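The principle behind a Burrows-Wheeler index can be shown in a few lines of Python. The sketch below builds the transform naively from sorted rotations and counts exact pattern occurrences by backward search. It is an illustration under simplifying assumptions and not Bowtie's implementation: a real index uses compressed rank tables and a sampled suffix array, and also tolerates mismatches. The function names are hypothetical.

```python
def bwt(text):
    """Burrows-Wheeler transform via sorted rotations (tiny inputs only)."""
    text += "$"  # sentinel, assumed absent from the text and smaller than any base
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_string, pattern):
    """Count exact matches of `pattern` using backward search on the BWT."""
    # first[c] = number of characters in the text that sort before c
    first, total = {}, 0
    for c in sorted(set(bwt_string)):
        first[c] = total
        total += bwt_string.count(c)

    def rank(c, i):
        # Occurrences of c in bwt_string[:i]; a real index precomputes these.
        return bwt_string[:i].count(c)

    lo, hi = 0, len(bwt_string)
    # Extend the match one character at a time, from the pattern's last base.
    for c in reversed(pattern):
        if c not in first:
            return 0
        lo = first[c] + rank(c, lo)
        hi = first[c] + rank(c, hi)
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    index = bwt("GATTACAGATTACA")
    print(count_occurrences(index, "ATTA"))
```

The search touches the index once per pattern character, independent of the genome length, which is why this family of indexes suits short-read alignment.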
|
|
bowtie2
ultrafast memory-efficient short read aligner
|
Versions of package bowtie2 |
Release | Version | Architectures |
sid | 2.5.4-1 | amd64,arm64,mips64el,ppc64el,riscv64 |
buster | 2.3.4.3-1 | amd64 |
bullseye | 2.4.2-2 | amd64,arm64,mips64el,ppc64el |
bookworm | 2.5.0-3 | amd64,arm64,mips64el,ppc64el |
stretch | 2.3.0-2 | amd64 |
jessie | 2.2.4-1 | amd64 |
trixie | 2.5.4-1 | amd64,arm64,mips64el,ppc64el,riscv64 |
|
License: DFSG free
|
Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads
to long reference sequences. It is particularly good at aligning reads
of about 50 up to 100s or 1,000s of characters, and particularly good
at aligning to relatively long (e.g. mammalian) genomes.
Bowtie 2 indexes the genome with an FM Index to keep its memory footprint
small: for the human genome, its memory footprint is typically
around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.
|
|
bwa
Burrows-Wheeler Aligner
|
Versions of package bwa |
Release | Version | Architectures |
bullseye | 0.7.17-6 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
sid | 0.7.18-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
trixie | 0.7.18-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
jessie | 0.7.10-1 | amd64 |
bookworm | 0.7.17-7 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
stretch | 0.7.15-2+deb9u1 | amd64 |
stretch-backports | 0.7.17-1~bpo9+1 | amd64 |
buster | 0.7.17-3 | amd64 |
Debtags of package bwa: |
biology | nuceleic-acids, peptidic |
field | biology, biology:bioinformatics |
interface | commandline, text-mode |
role | program |
use | analysing, comparing |
|
License: DFSG free
|
BWA is a software package for mapping low-divergent sequences against
a large reference genome, such as the human genome. It consists of
three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The first
algorithm is designed for Illumina sequence reads up to 100bp, while
the other two are designed for longer sequences ranging from 70bp to 1Mbp. BWA-MEM
and BWA-SW share similar features such as long-read support and split
alignment, but BWA-MEM, which is the latest, is generally recommended
for high-quality queries as it is faster and more accurate. BWA-MEM
also has better performance than BWA-backtrack for 70-100bp Illumina
reads.
|
|
canu
single molecule sequence assembler for genomes
|
Versions of package canu |
Release | Version | Architectures |
buster | 1.8+dfsg-2 | amd64 |
bullseye | 2.0+dfsg-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
sid | 2.2+dfsg-5 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
stretch-backports | 1.7.1+dfsg-1~bpo9+1 | amd64 |
bookworm | 2.0+dfsg-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
|
License: DFSG free
|
Canu is a fork of the Celera Assembler, designed for high-noise
single-molecule sequencing (such as the PacBio RS II or Oxford
Nanopore MinION).
Canu is a hierarchical assembly pipeline which runs in four steps:
- Detect overlaps in high-noise sequences using MHAP
- Generate corrected sequence consensus
- Trim corrected sequences
- Assemble trimmed corrected sequences
|
|
changeo
Repertoire clonal assignment toolkit (Python 3)
|
Versions of package changeo |
Release | Version | Architectures |
buster | 0.4.5-1 | all |
bullseye | 1.0.2-1 | all |
trixie | 1.3.0-2 | all |
sid | 1.3.0-2 | all |
bookworm | 1.3.0-1 | all |
|
License: DFSG free
|
Change-O is a collection of tools for processing the output of V(D)J
alignment tools, assigning clonal clusters to immunoglobulin (Ig)
sequences, and reconstructing germline sequences.
Dramatic improvements in high-throughput sequencing technologies now
enable large-scale characterization of Ig repertoires, defined as the
collection of trans-membrane antigen-receptor proteins located on the
surface of B cells and T cells. Change-O is a suite of utilities to
facilitate advanced analysis of Ig and TCR sequences following germline
segment assignment. Change-O handles output from IMGT/HighV-QUEST
and IgBLAST, and provides a wide variety of clustering methods for
assigning clonal groups to Ig sequences. Record sorting, grouping,
and various database manipulation operations are also included.
This package installs the library for Python 3.
Please cite:
Namita T. Gupta, Jason A. Vander Heiden, Mohamed Uduman, Daniel Gadala-Maria, Gur Yaari and Steven H. Kleinstein: Bioinformatics 31(20):3356-3358 (2015)
|
|
crac
integrated RNA-Seq read analysis
|
Versions of package crac |
Release | Version | Architectures |
trixie | 2.5.2+dfsg-6 | amd64,arm64,mips64el,ppc64el,riscv64 |
sid | 2.5.2+dfsg-6 | amd64,arm64,mips64el,ppc64el,riscv64 |
bookworm | 2.5.2+dfsg-5 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el |
bullseye | 2.5.2+dfsg-4 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el |
stretch | 2.5.0+dfsg-1 | amd64 |
buster | 2.5.0+dfsg-3 | amd64,arm64 |
|
License: DFSG free
|
CRAC is a tool to analyze High Throughput Sequencing (HTS) data in
comparison to a reference genome. It is intended for transcriptomic
and genomic sequencing reads. More precisely, with transcriptomic
reads as input, it predicts point mutations, indels, splice junctions,
and chimeric RNAs (i.e., non-colinear splice junctions). CRAC can also
output the positions and nature of sequence errors that it detects in the
reads. CRAC uses a genome index. This index must be computed before
running the read analysis. To do so, use the command "crac-index"
on your genome files. You can then process the reads using the command
crac. See the man page of CRAC (help file) by typing "man crac". CRAC
requires a large amount of main memory on your computer. For processing
against the Human genome, say 50 million reads of 100 nucleotides each,
CRAC requires about 40 gigabytes of main memory. Check whether
your computing server is equipped with a sufficient amount of
memory before launching an analysis.
|
|
cutadapt
Clean biological sequences from high-throughput sequencing reads
|
Versions of package cutadapt |
Release | Version | Architectures |
bullseye | 3.2-2 | all |
sid | 4.7-2 | all |
trixie | 4.7-2 | all |
bookworm | 4.2-1 | all |
buster | 1.18-1 | all |
stretch | 1.12-2 | all |
upstream | 4.9 |
|
License: DFSG free
|
Cutadapt helps with biological sequence cleaning tasks by finding the adapter
or primer sequences in an error-tolerant way.
It can also modify and filter reads in various ways.
Adapter sequences can contain IUPAC wildcard characters.
Also, paired-end reads and even colorspace data are supported.
If you want, you can also just demultiplex your input data, without removing
adapter sequences at all.
This package contains the user interface.
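Error-tolerant adapter matching with IUPAC wildcards can be sketched in Python as follows. This simplified illustration counts substitutions only and removes a 3' adapter together with everything after it. It is not cutadapt's algorithm, and the function name, the default error rate and the minimum overlap are assumptions chosen for the example.

```python
# Bases that each IUPAC code in the adapter is allowed to match
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def trim_3prime_adapter(read, adapter, max_error_rate=0.1, min_overlap=3):
    """Return `read` without the first acceptable adapter match and what follows it."""
    for start in range(len(read) - min_overlap + 1):
        # The adapter may run off the end of the read (partial adapter).
        overlap = min(len(adapter), len(read) - start)
        mismatches = sum(
            read[start + i] not in IUPAC[adapter[i]] for i in range(overlap)
        )
        # Allowed mismatches scale with the length of the compared region.
        if mismatches <= int(max_error_rate * overlap):
            return read[:start]
    return read


if __name__ == "__main__":
    print(trim_3prime_adapter("ACGTACGTTTAGATCGGAAGAGC", "AGATCGGAAGAGC"))
```

Scaling the allowed mismatches with the overlap length means that short partial matches at the end of a read must be exact, which limits spurious trimming.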
The package is enhanced by the following packages:
multiqc
|
|
daligner
local alignment discovery between long nucleotide sequencing reads
|
Versions of package daligner |
Release | Version | Architectures |
buster | 1.0+git20180524.fd21879-1 | amd64,arm64,armhf,i386 |
sid | 1.0+git20240119.335105d-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bullseye | 1.0+git20200727.ed40ce5-3 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
stretch | 1.0+20161119-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
bookworm | 1.0+git20221215.bd26967-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
trixie | 1.0+git20240119.335105d-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
|
License: DFSG free
|
These tools permit one to find all significant local alignments between
reads encoded in a Dazzler database. The assumption is that the reads are
from a Pacific Biosciences RS II long read sequencer. That is, the reads
are long and noisy, with error rates of up to 15% on average.
|
|
deepnano
alternative basecaller for MinION reads of genomic sequences
|
Versions of package deepnano |
Release | Version | Architectures |
bullseye | 0.0+git20170813.e8a621e-3.1 | amd64,arm64,armhf,i386,ppc64el,s390x |
buster | 0.0+git20170813.e8a621e-3 | amd64,arm64,i386 |
|
License: DFSG free
|
DeepNano is an alternative basecaller for Oxford Nanopore MinION reads
based on deep recurrent neural networks.
Currently it works with SQK-MAP-006 and SQK-MAP-005 chemistry and as a
postprocessor for Metrichor.
|
|
discosnp
discovering Single Nucleotide Polymorphism from raw set(s) of reads
|
Versions of package discosnp |
Release | Version | Architectures |
bullseye | 4.4.4-1 | amd64,arm64,i386,mips64el,ppc64el,s390x |
trixie | 2.6.2-3 | amd64,arm64,mips64el,ppc64el,riscv64 |
jessie | 1.2.5-1 | amd64,armel,armhf,i386 |
stretch | 1.2.6-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
buster | 2.3.0-2 | amd64,arm64,i386 |
bookworm | 2.6.2-2 | amd64,arm64,mips64el,ppc64el |
sid | 2.6.2-3 | amd64,arm64,mips64el,ppc64el,riscv64 |
|
License: DFSG free
|
Software discoSnp is designed for discovering Single Nucleotide
Polymorphism (SNP) from raw set(s) of reads obtained with Next Generation
Sequencers (NGS).
Note that the number of input read sets is not constrained: it can be one, two,
or more. Note also that no other data, such as a reference genome or annotations,
are needed.
The software is composed of two modules. The first module, kissnp2, detects SNPs
from read sets. The second module, kissreads, enhances the kissnp2 results by
computing per read set and for each found SNP:
1) its mean read coverage
2) the (phred) quality of reads generating the polymorphism.
This program is superseded by DiscoSnp++.
|
|
dnaclust
tool for clustering millions of short DNA sequences
|
Versions of package dnaclust |
Release | Version | Architectures |
jessie | 3-2 | amd64,armel,armhf,i386 |
sid | 3-7 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
trixie | 3-7 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 3-7 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bullseye | 3-7 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
buster | 3-6 | amd64,arm64,armhf,i386 |
stretch | 3-4 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
|
License: DFSG free
|
dnaclust is a tool for clustering a large number of short DNA sequences.
The clusters are created in such a way that the "radius" of each
cluster is no more than the specified threshold.
The input sequences to be clustered should be in Fasta format. The id
of each sequence is based on the first word of the sequence in the Fasta
format. The first word is the prefix of the header up to the first
occurrence of white space characters in the header.
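The ID rule described above (header text up to the first whitespace) can be expressed as a short Python sketch. The function name is hypothetical and the code is not part of dnaclust; it only illustrates how IDs would be derived from Fasta headers.

```python
def fasta_ids(lines):
    """Yield the ID of each Fasta record: the header text up to the first whitespace."""
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split(None, 1)  # split on any run of whitespace
            yield fields[0] if fields else ""


if __name__ == "__main__":
    records = [">seq1 Homo sapiens\n", "ACGT\n", ">seq2\tsecond record\n", "GG\n"]
    print(list(fasta_ids(records)))
```

Two records whose headers share the same first word would receive the same ID, so it is worth checking that the IDs are unique before clustering.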
|
|
dwgsim
short sequencing read simulator
|
Versions of package dwgsim |
Release | Version | Architectures |
trixie | 0.1.14-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
sid | 0.1.14-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 0.1.14-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bullseye | 0.1.12-4 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
buster | 0.1.12-2 | amd64,arm64,armhf |
stretch | 0.1.11-3 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
|
License: DFSG free
|
DWGSIM simulates short sequencing reads from modern sequencing platforms.
DWGSIM generates base error rates using a parametric model, allowing a more
realistic error profile. It was originally developed for use in evaluating
short read aligners.
|
|
ea-utils
command-line tools for processing biological sequencing data
|
Versions of package ea-utils |
Release | Version | Architectures |
buster | 1.1.2+dfsg-5 | amd64,arm64,armhf,i386 |
sid | 1.1.2+dfsg-9 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
trixie | 1.1.2+dfsg-9 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 1.1.2+dfsg-9 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bullseye | 1.1.2+dfsg-6 | amd64,arm64,armhf,i386,mips64el,mipsel,ppc64el,s390x |
stretch | 1.1.2+dfsg-4 | amd64,arm64,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
|
License: DFSG free
|
Ea-utils provides a set of command-line tools for processing biological
sequencing data, barcode demultiplexing, adapter trimming, etc.
Primarily written to support an Illumina based pipeline - but should work with
any FASTQs.
Main Tools are:
- fastq-mcf: Scans a sequence file for adapters, and, based on a log-scaled threshold,
determines a set of clipping parameters and performs clipping. Also does
skewing detection and quality filtering.
- fastq-multx: Demultiplexes a fastq. Capable of auto-determining barcode id's based on a
master set fields. Keeps multiple reads in-sync during demultiplexing. Can
verify that the reads are in-sync as well, and fail if they're not.
- fastq-join: Similar to audy's stitch program, but in C, more efficient and supports some
automatic benchmarking and tuning. It uses the same "squared distance for
anchored alignment" as other tools.
- varcall: Takes a pileup and calculates variants in a more easily parameterized manner
than some other tools.
|
|
fastaq
FASTA and FASTQ file manipulation tools
|
Versions of package fastaq |
Release | Version | Architectures |
sid | 3.17.0-8 | all |
jessie | 1.5.0-1 | all |
stretch | 3.14.0-1 | all |
buster | 3.17.0-2 | all |
bullseye | 3.17.0-3 | all |
bookworm | 3.17.0-5 | all |
trixie | 3.17.0-8 | all |
|
License: DFSG free
|
Fastaq represents a diverse collection of scripts that perform useful and
common FASTA/FASTQ manipulation tasks, such as filtering, merging, splitting,
sorting, trimming, search/replace, etc. Input and output files can be gzipped
(format is automatically detected) and individual Fastaq commands can be piped
together.
|
|
fastp
Ultra-fast all-in-one FASTQ preprocessor
|
Versions of package fastp |
Release | Version | Architectures |
bookworm | 0.23.2+dfsg-2 | amd64,arm64,armel,armhf,mips64el,mipsel,ppc64el,s390x |
sid | 0.23.4+dfsg-1 | amd64,arm64,armel,armhf,mips64el,ppc64el,riscv64,s390x |
trixie | 0.23.4+dfsg-1 | amd64,arm64,armel,armhf,mips64el,ppc64el,riscv64,s390x |
buster | 0.19.6+dfsg-1 | amd64,arm64,armhf,i386 |
bullseye | 0.20.1+dfsg-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
upstream | 0.24.0 |
|
License: DFSG free
|
All-in-one FASTQ preprocessor, fastp provides functions including quality
profiling, adapter trimming, read filtering and base correction. It supports
both single-end and paired-end short read data and also provides basic support
for long-read data.
The package is enhanced by the following packages:
multiqc
|
|
fastqc
quality control for high throughput sequence data
|
Versions of package fastqc |
Release | Version | Architectures |
jessie | 0.11.2+dfsg-3 | all |
bookworm | 0.11.9+dfsg-6 | all |
trixie | 0.12.1+dfsg-4 | all |
sid | 0.12.1+dfsg-4 | all |
bullseye | 0.11.9+dfsg-4 | all |
buster | 0.11.8+dfsg-2 | all |
stretch | 0.11.5+dfsg-6 | all |
|
License: DFSG free
|
FastQC aims to provide a simple way to do some quality control checks on
raw sequence data coming from high throughput sequencing pipelines. It
provides a modular set of analyses which you can use to give a quick
impression of whether your data has any problems of which you should
be aware before doing any further analysis.
The main functions of FastQC are
- Import of data from BAM, SAM or FastQ files (any variant)
- Providing a quick overview to tell you in which areas there may
be problems
- Summary graphs and tables to quickly assess your data
- Export of results to an HTML based permanent report
- Offline operation to allow automated generation of reports without
running the interactive application
The package is enhanced by the following packages:
multiqc
|
|
flexbar
flexible barcode and adapter removal for sequencing platforms
|
Versions of package flexbar |
Release | Version | Architectures |
experimental | 3.5.0-6 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
sid | 3.5.0-5 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
jessie | 2.50-1 | amd64,armhf,i386 |
stretch | 2.50-2 | amd64,arm64,armhf,i386,mips,mips64el,mipsel,ppc64el |
buster | 3.4.0-2 | amd64,arm64,armhf,i386 |
bullseye | 3.5.0-3 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bookworm | 3.5.0-5 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
trixie | 3.5.0-5 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
|
License: DFSG free
|
Flexbar preprocesses high-throughput sequencing data efficiently. It
demultiplexes barcoded runs and removes adapter sequences. Moreover,
trimming and filtering features are provided. Flexbar increases mapping
rates and improves genome and transcriptome assemblies. It supports
next-generation sequencing data in fasta/q and csfasta/q format from
Illumina, Roche 454, and the SOLiD platform.
Parameter names have changed in Flexbar; please review existing scripts. In
recent months, default settings were optimised, several bugs were fixed and
various improvements were made, e.g. a revamped command-line interface,
new trimming modes, and lower time and memory requirements.
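To make the idea of adapter removal concrete, here is a minimal sketch that trims a 3' adapter using exact string matching. It is a simplification under stated assumptions: the function name and the min_overlap parameter are hypothetical, and it does not reproduce Flexbar's own matching, which is not described here.

```python
def trim_adapter(read, adapter, min_overlap=3):
    """Remove a 3' adapter from read using exact matching only.

    A full copy of the adapter is cut wherever it first occurs; otherwise a
    prefix of the adapter of at least min_overlap bases is cut from the read end.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    longest = min(len(adapter) - 1, len(read))
    for overlap in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:-overlap]
    return read


if __name__ == "__main__":
    adapter = "AGATCGGAAG"
    print(trim_adapter("ACGTACGTAGATCGGAAG", adapter))  # full adapter present
    print(trim_adapter("ACGTACGTAGATC", adapter))       # partial adapter at the end
```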
The package is enhanced by the following packages:
multiqc
|
|
fml-asm
tool for assembling Illumina short reads in small regions
|
Versions of package fml-asm |
Release | Version | Architectures |
sid | 0.1+git20190320.b499514-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
stretch | 0.1-2 | amd64 |
stretch-backports | 0.1-4~bpo9+1 | amd64 |
buster | 0.1-5 | amd64 |
bullseye | 0.1+git20190320.b499514-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bookworm | 0.1+git20190320.b499514-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
trixie | 0.1+git20190320.b499514-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
experimental | 0.1+git20190320.b499514-2~0exp | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
upstream | 0.1+git20221215.85f159e |
|
License: DFSG free
|
Fml-asm is a command-line tool for assembling Illumina short reads in regions
from 100bp to 10 million bp in size, based on the fermi-lite library.
It is largely a light-weight in-memory version of fermikit without
generating any intermediate files. It inherits the performance, the relatively
small memory footprint and the features of fermikit. In particular, fermi-lite
is able to retain heterozygous events and thus can be used to assemble diploid
regions for the purpose of variant calling.
|
|
fsm-lite
frequency-based string mining (lite)
|
Versions of package fsm-lite |
Release | Version | Architectures |
buster | 1.0-3 | amd64,arm64 |
bookworm | 1.0-8 | amd64,arm64,mips64el,ppc64el,s390x |
trixie | 1.0-8 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
sid | 1.0-8 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
stretch | 1.0-2 | amd64,arm64,mips64el,ppc64el,s390x |
bullseye | 1.0-5 | amd64,arm64,mips64el,ppc64el,s390x |
|
License: DFSG free
|
A single-core implementation of frequency-based substring mining used in
bioinformatics to extract substrings that discriminate two (or more)
datasets inside high-throughput sequencing data.
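The notion of a discriminating substring can be sketched naively as follows: count in how many sequences each substring occurs, then keep those found in every sequence of one group and in none of the other. This brute-force enumeration is only an illustration; the function names and the length limits are assumptions, and it is not how fsm-lite is implemented.

```python
from collections import Counter


def substring_support(sequences, min_len=2, max_len=4):
    """Count, for every substring, how many sequences contain it at least once."""
    support = Counter()
    for seq in sequences:
        seen = set()
        for length in range(min_len, max_len + 1):
            for start in range(len(seq) - length + 1):
                seen.add(seq[start:start + length])
        support.update(seen)
    return support


def discriminating_substrings(group_a, group_b, min_len=2, max_len=4):
    """Substrings present in every sequence of group_a and in none of group_b."""
    in_a = substring_support(group_a, min_len, max_len)
    in_b = substring_support(group_b, min_len, max_len)
    return sorted(s for s, n in in_a.items() if n == len(group_a) and in_b[s] == 0)


if __name__ == "__main__":
    print(discriminating_substrings(["ACGT", "TACG"], ["GGTT"]))
```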
|
|
giira
RNA-Seq driven gene finding incorporating ambiguous reads
|
Versions of package giira |
Release | Version | Architectures |
stretch | 0.0.20140625-1 | amd64 |
buster | 0.0.20140625-2 | amd64 |
jessie | 0.0.20140210-2 | amd64 |
|
License: DFSG free
|
GIIRA is a gene prediction method that identifies potential coding
regions exclusively based on the mapping of reads from an RNA-Seq
experiment. It was foremost designed for prokaryotic gene prediction
and is able to resolve genes within the expressed region of an operon.
However, it is also applicable to eukaryotes and predicts exon intron
structures as well as alternative isoforms.
|
|
grinder
Versatile omics shotgun and amplicon sequencing read simulator
|
Versions of package grinder |
Release | Version | Architectures |
sid | 0.5.4-6 | all |
stretch | 0.5.4-1 | all |
jessie | 0.5.3-3 | all |
buster | 0.5.4-5 | all |
trixie | 0.5.4-6 | all |
bullseye | 0.5.4-6 | all |
bookworm | 0.5.4-6 | all |
|
License: DFSG free
|
Grinder is a versatile program to create random shotgun and amplicon sequence
libraries based on DNA, RNA or protein reference sequences provided in a
FASTA file.
Grinder can produce genomic, metagenomic, transcriptomic, metatranscriptomic,
proteomic, metaproteomic shotgun and amplicon datasets from current
sequencing technologies such as Sanger, 454, Illumina. These simulated
datasets can be used to test the accuracy of bioinformatic tools under
specific hypotheses, e.g. with or without sequencing errors, or with low or
high community diversity. Grinder may also be used to help decide between
alternative sequencing methods for a sequence-based project, e.g. should the
library be paired-end or not, how many reads should be sequenced.
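A toy version of shotgun read simulation, with or without sequencing errors, might look like the sketch below. It draws fixed-length reads uniformly from a reference string and optionally substitutes bases at a given rate. The function name, parameters and uniform error model are assumptions for illustration and are unrelated to Grinder's actual options.

```python
import random


def simulate_shotgun_reads(reference, read_length, num_reads, error_rate=0.0, seed=0):
    """Draw num_reads substrings of read_length from reference, with optional substitutions."""
    if read_length > len(reference):
        raise ValueError("read_length exceeds reference length")
    rng = random.Random(seed)
    reads = []
    for _ in range(num_reads):
        start = rng.randint(0, len(reference) - read_length)
        bases = list(reference[start:start + read_length])
        for i, base in enumerate(bases):
            if rng.random() < error_rate:
                # Substitute with a different nucleotide to model a sequencing error.
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append("".join(bases))
    return reads


if __name__ == "__main__":
    for read in simulate_shotgun_reads("ACGTACGTTGCAAGCT", 6, 5, error_rate=0.1, seed=42):
        print(read)
```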
|
|
hilive
realtime alignment of Illumina reads
|
Versions of package hilive |
Release | Version | Architectures |
experimental | 2.0a-5 | arm64,armel,armhf,ppc64el,riscv64 |
bookworm | 2.0a-3 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
buster | 1.1-2 | amd64,arm64,armhf |
bullseye | 2.0a-3 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
stretch | 0.3-2 | amd64,arm64,armel,i386,mips64el,mipsel,ppc64el |
sid | 2.0a-4 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
|
License: DFSG free
|
HiLive is a read mapping tool that maps Illumina HiSeq (or comparable)
reads to a reference genome at the moment they are produced.
This means that read mapping is finished as soon as the sequencer has
finished generating the data.
|
|
hinge
long read genome assembler based on hinging
|
Versions of package hinge |
Release | Version | Architectures |
bookworm | 0.5.0-7 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el |
trixie | 0.5.0-7 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64 |
sid | 0.5.0-7 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64 |
buster | 0.5.0-4 | amd64,arm64 |
bullseye | 0.5.0-6 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el |
|
License: DFSG free
|
HINGE is a genome assembler that seeks to achieve optimal repeat resolution
by distinguishing repeats that can be resolved given the data from those that
cannot. This is accomplished by adding “hinges” to reads for constructing an
overlap graph where only unresolvable repeats are merged. As a result, HINGE
combines the error resilience of overlap-based assemblers with
repeat-resolution capabilities of de Bruijn graph assemblers.
|
|
hisat2
graph-based alignment of short nucleotide reads to many genomes
|
Versions of package hisat2 |
Release | Version | Architectures |
sid | 2.2.1-5 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
buster | 2.1.0-2 | amd64 |
bullseye | 2.2.1-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
stretch | 2.0.5-1 | amd64 |
bookworm | 2.2.1-4 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
trixie | 2.2.1-5 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
|
License: DFSG free
|
HISAT2 is a fast and sensitive alignment program for mapping next-generation
sequencing reads (both DNA and RNA) to a population of human genomes (as well
as against a single reference genome). Based on an extension of BWT for graphs,
a graph FM index (GFM) was designed and implemented. In addition to using
one global GFM index that represents a population of human genomes, HISAT2
uses a large set of small GFM indexes that collectively cover the whole genome
(each index representing a genomic region of 56 Kbp, with 55,000 indexes
needed to cover the human population). These small indexes (called local
indexes), combined with several alignment strategies, enable rapid and
accurate alignment of sequencing reads. This new indexing scheme is called a
Hierarchical Graph FM index (HGFM).
The package is enhanced by the following packages:
multiqc
|
|
idba
iterative De Bruijn Graph short read assemblers
|
Versions of package idba |
Release | Version | Architectures |
trixie | 1.1.3-8 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
jessie | 1.1.2-1 | amd64,armel,armhf,i386 |
stretch | 1.1.3-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
buster | 1.1.3-3 | amd64,arm64,armhf,i386 |
bullseye | 1.1.3-7 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bookworm | 1.1.3-8 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
sid | 1.1.3-8 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
|
License: DFSG free
|
IDBA stands for iterative de Bruijn graph assembler. In computational
sequence biology, an assembler solves the puzzle coming from large
sequencing machines that feature many gigabytes of short reads from a
large genome.
This package provides several flavours of the IDBA assembler, as they all
share the same source tree but serve different purposes and evolved over time.
IDBA is the basic iterative de Bruijn graph assembler for
second-generation sequencing reads. IDBA-UD, an extension of IDBA,
is designed to utilize paired-end reads to assemble low-depth regions
and use progressive depth on contigs to reduce errors in high-depth
regions. It is a general-purpose assembler and especially good for
single-cell and metagenomic sequencing data. IDBA-Hybrid is an
updated version of IDBA-UD, which can make use of a similar reference
genome to improve the assembly result. IDBA-Tran is an iterative de Bruijn
graph assembler for RNA-Seq data.
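For readers unfamiliar with de Bruijn graphs, the sketch below builds one from a set of reads for a single value of k: nodes are (k-1)-mers and each k-mer contributes an edge from its prefix to its suffix. An iterative assembler would repeat such a construction for several values of k; that loop, error handling and contig extraction are omitted, and the function name is an assumption rather than part of IDBA.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


if __name__ == "__main__":
    reads = ["ACGTAC", "GTACGA"]
    for k in (3, 4):
        print(k, de_bruijn_graph(reads, k))
```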
|
|
igdiscover
analyzes antibody repertoires to find new V genes
|
Versions of package igdiscover |
Release | Version | Architectures |
bullseye | 0.11-3 | all |
sid | 0.11-4 | all |
upstream | 0.15.1 |
|
License: DFSG free
|
IgDiscover analyzes antibody repertoires and discovers new V genes from
high-throughput sequencing reads. Heavy chains, kappa and lambda light
chains are supported (to discover VH, VK and VL genes).
|
|
igor
infers V(D)J recombination processes from sequencing data
|
Versions of package igor |
Release | Version | Architectures |
trixie | 1.4.0+dfsg-5 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
buster | 1.3.0+dfsg-1 | amd64,arm64,armhf,i386 |
bullseye | 1.4.0+dfsg-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bookworm | 1.4.0+dfsg-4 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
sid | 1.4.0+dfsg-5 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
|
License: DFSG free
|
IGoR (Inference and Generation of Repertoires) is versatile software
for analyzing and modeling immune receptor generation, selection, mutation
and all other processes.
|
|
igv
Integrative Genomics Viewer
|
Versions of package igv |
Release | Version | Architectures |
jessie | 2.3.38+dfsg-1 (non-free) | all |
stretch | 2.3.90+dfsg-1 (non-free) | all |
sid | 2.18.5+dfsg-1 | all |
bullseye | 2.6.3+dfsg-3 (non-free) | all |
trixie | 2.18.5+dfsg-1 | all |
bookworm | 2.16.0+dfsg-1 | all |
Debtags of package igv: |
field | biology |
interface | x11 |
network | client |
role | program |
scope | utility |
use | viewing |
works-with | biological-sequence |
|
License: DFSG free
|
The Integrative Genomics Viewer (IGV) is a high-performance viewer that
efficiently handles large heterogeneous data sets, while providing a
smooth and intuitive user experience at all levels of genome resolution.
A key characteristic of IGV is its focus on the integrative nature of
genomic studies, with support for both array-based and next-generation
sequencing data, and the integration of clinical and phenotypic data.
Although IGV is often used to view genomic data from public sources,
its primary emphasis is to support researchers who wish to visualize and
explore their own data sets or those from colleagues. To that end, IGV
supports flexible loading of local and remote data sets, and is
optimized to provide high-performance data visualization and exploration
on standard desktop systems.
Please cite:
James T Robinson, Helga Thorvaldsdóttir, Wendy Winckler, Mitchell Guttman, Eric S Lander, Gad Getz and Jill P Mesirov:
Integrative genomics viewer.
(PubMed,eprint)
Nature Biotechnology
29(1):24–26
(2011)
|
|
iva
iterative virus sequence assembler
|
Versions of package iva |
Release | Version | Architectures |
trixie | 1.0.11+ds-5 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64 |
stretch | 1.0.8+ds-1 | amd64,arm64,mips64el,ppc64el |
buster | 1.0.9+ds-6 | amd64,arm64 |
bullseye | 1.0.9+ds-11 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el |
bookworm | 1.0.11+ds-3 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el |
sid | 1.0.11+ds-5 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64 |
|
License: DFSG free
|
IVA is a de novo assembler designed to assemble
virus genomes that have no repeat sequences,
using Illumina read pairs sequenced from mixed
populations at extremely high depth.
IVA's main algorithm works by iteratively extending
contigs using aligned read pairs. Its input can be
just read pairs, or additionally you can provide an
existing set of contigs to be extended. Alternatively,
it can take reads together with a reference sequence.
|
|
khmer
in-memory DNA sequence kmer counting, filtering & graph traversal
|
Versions of package khmer |
Release | Version | Architectures |
bullseye | 2.1.2+dfsg-8 | amd64,arm64 |
experimental | 3.0.0~a3+dfsg-9~0exp | amd64 |
sid | 3.0.0~a3+dfsg-8 | amd64 |
stretch | 2.0+dfsg-10 | amd64,arm64,mips64el,ppc64el |
buster | 2.1.2+dfsg-6 | amd64,arm64 |
bookworm | 3.0.0~a3+dfsg-4 | amd64 |
|
License: DFSG free
|
khmer is a library and suite of command line tools for working with DNA
sequence. It is primarily aimed at short-read sequencing data such as that
produced by the Illumina platform. khmer takes a k-mer-centric approach to
sequence analysis, hence the name.
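The k-mer-centric view can be illustrated with a few lines of plain Python that count every k-mer in a sequence. This uses an ordinary in-memory dictionary and a hypothetical function name; it is not khmer's API and makes no claim about khmer's internal data structures.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count all overlapping k-mers of a DNA sequence (case-insensitive)."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


if __name__ == "__main__":
    counts = count_kmers("ACGACGTT", 3)
    for kmer, n in sorted(counts.items()):
        print(kmer, n)
```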
Please cite:
Michael R. Crusoe, Hussien F. Alameldin, Sherine Awad, Elmar Bucher, Adam Caldwell, Reed Cartwright, Amanda Charbonneau, Bede Constantinides, Greg Edvenson, Scott Fay, Jacob Fenton, Thomas Fenzl, Jordan Fish, Leonor Garcia-Gutierrez, Phillip Garland, Jonathan Gluck, Iván González, Sarah Guermond, Jiarong Guo, Aditi Gupta, Joshua R. Herr, Adina Howe, Alex Hyer, Andreas Härpfer, Luiz Irber, Rhys Kidd, David Lin, Justin Lippi, Tamer Mansour, Pamela McA'Nulty, Eric McDonald, Jessica Mizzi, Kevin D. Murray, Joshua R. Nahum, Kaben Nanlohy, Alexander Johan Nederbragt, Humberto Ortiz-Zuazaga, Jeramia Ory, Jason Pell, Charles Pepe-Ranney, Zachary N Russ, Erich Schwarz, Camille Scott, Josiah Seaman, Scott Sievert, Jared Simpson, Connor T. Skennerton, James Spencer, Ramakrishnan Srinivasan, Daniel Standage, James A. Stapleton, Joe Stein, Susan R Steinman, Benjamin Taylor, Will Trimble, Heather L. Wiencko, Michael Wright, Brian Wyss, Qingpeng Zhang, en zyme and C. Titus Brown:
The khmer software package: enabling efficient sequence analysis.
(2015)
|
|
kissplice
Detection of various kinds of polymorphisms in RNA-seq data
|
Versions of package kissplice |
Release | Version | Architectures |
sid | 2.6.7-1 | amd64,arm64,mips64el,ppc64el,riscv64 |
jessie | 2.2.1-3 | amd64 |
stretch | 2.4.0-p1-1 | amd64,arm64,mips64el,ppc64el |
buster | 2.4.0-p1-4 | amd64,arm64 |
bullseye | 2.5.3-3 | amd64,arm64,mips64el,ppc64el |
bookworm | 2.6.2-2 | amd64,arm64,mips64el,ppc64el |
trixie | 2.6.7-1 | amd64,arm64,mips64el,ppc64el,riscv64 |
Debtags of package kissplice: |
biology | nuceleic-acids |
field | biology, biology:bioinformatics |
interface | commandline |
role | program |
use | analysing |
works-with | biological-sequence |
|
License: DFSG free
|
KisSplice is a piece of software that enables the analysis of RNA-seq data
with or without a reference genome. It is an exact local transcriptome
assembler that allows one to identify SNPs, indels and alternative splicing
events. It can deal with an arbitrary number of biological conditions, and
will quantify each variant in each condition.
It has been tested on Illumina datasets of up to 1G reads.
Its memory consumption is around 5Gb for 100M reads.
Topics: RNA-seq; RNA splicing; Gene structure
|
|
kraken
assigning taxonomic labels to short DNA sequences
|
Versions of package kraken |
Release | Version | Architectures |
bookworm | 1.1.1-4 | amd64,arm64,mips64el,ppc64el |
sid | 1.1.1-4 | amd64,arm64,mips64el,ppc64el,riscv64 |
trixie | 1.1.1-4 | amd64,arm64,mips64el,ppc64el,riscv64 |
stretch-backports | 1.1-2~bpo9+1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x |
buster | 1.1-3 | amd64,arm64,armhf,i386 |
stretch | 0.10.5~beta-2 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
bullseye | 1.1.1-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x |
|
License: DFSG free
|
Kraken is a system for assigning taxonomic labels to short DNA
sequences, usually obtained through metagenomic studies. Previous
attempts by other bioinformatics software to accomplish this task have
often used sequence alignment or machine learning techniques that were
quite slow, leading to the development of less sensitive but much faster
abundance estimation programs. Kraken aims to achieve high sensitivity
and high speed by utilizing exact alignments of k-mers and a novel
classification algorithm.
In its fastest mode of operation, for a simulated metagenome of 100 bp
reads, Kraken processed over 4 million reads per minute on a single
core, over 900 times faster than Megablast and over 11 times faster than
the abundance estimation program MetaPhlAn. Kraken's accuracy is
comparable with Megablast, with slightly lower sensitivity and very high
precision.
|
|
kraken2
taxonomic classification system using exact k-mer matches
|
Versions of package kraken2 |
Release | Version | Architectures |
trixie | 2.1.3-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
sid | 2.1.3-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bullseye | 2.1.1-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bookworm | 2.1.2-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
|
License: DFSG free
|
Kraken 2 is the newest version of Kraken, a taxonomic classification
system using exact k-mer matches to achieve high accuracy and fast
classification speeds. This classifier matches each k-mer within a query
sequence to the lowest common ancestor (LCA) of all genomes containing
the given k-mer. The k-mer assignments inform the classification
algorithm. [see: Kraken 1's Webpage for more details].
Kraken 2 provides significant improvements to Kraken 1, with faster
database build times, smaller database sizes, and faster classification
speeds. These improvements were achieved by the following updates to the
Kraken classification program:
1. Storage of Minimizers: Instead of storing/querying entire k-mers,
Kraken 2 stores minimizers (l-mers) of each k-mer. The length of
each l-mer must be ≤ the k-mer length. Each k-mer is treated by
Kraken 2 as if its LCA is the same as its minimizer's LCA.
2. Introduction of Spaced Seeds: Kraken 2 also uses spaced seeds to
store and query minimizers to improve classification accuracy.
3. Database Structure: While Kraken 1 saved an indexed and sorted list
of k-mer/LCA pairs, Kraken 2 uses a compact hash table. This hash
table is a probabilistic data structure that allows for faster
queries and lower memory requirements. However, this data structure
does have a <1% chance of returning the incorrect LCA or returning
an LCA for a non-inserted minimizer. Users can compensate for this
possibility by using Kraken's confidence scoring thresholds.
4. Protein Databases: Kraken 2 allows for databases built from amino
acid sequences. When queried, Kraken 2 performs a six-frame
translated search of the query sequences against the database.
5. 16S Databases: Kraken 2 also provides support for databases not
based on NCBI's taxonomy. Currently, these include the 16S
databases: Greengenes, SILVA, and RDP.
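Two of the ideas above, minimizers and lowest common ancestors, can be sketched in a few lines. The ordering used to pick the minimizer (plain lexicographic order) and the toy taxonomy are assumptions made for illustration; spaced seeds, the compact hash table and Kraken 2's actual minimizer ordering are not modelled.

```python
def minimizer(kmer, l):
    """Return the lexicographically smallest l-mer of kmer (requires 0 < l <= len(kmer))."""
    if not 0 < l <= len(kmer):
        raise ValueError("l must satisfy 0 < l <= len(kmer)")
    return min(kmer[i:i + l] for i in range(len(kmer) - l + 1))


def lowest_common_ancestor(parent, a, b):
    """LCA of taxa a and b in a single-rooted tree given as child -> parent (root -> None)."""
    ancestors = set()
    node = a
    while node is not None:
        ancestors.add(node)
        node = parent[node]
    node = b
    while node not in ancestors:
        node = parent[node]
    return node


if __name__ == "__main__":
    # Hypothetical toy taxonomy.
    parent = {"root": None, "Bacteria": "root",
              "E. coli": "Bacteria", "Salmonella": "Bacteria"}
    print(minimizer("GATTACA", 3))
    print(lowest_common_ancestor(parent, "E. coli", "Salmonella"))
```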
|
|
last-align
genome-scale comparison of biological sequences
|
Versions of package last-align |
Release | Version | Architectures |
trixie | 1542-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
buster | 963-2 | amd64,arm64,armhf,i386 |
bullseye | 1179-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bookworm | 1447-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
sid | 1542-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
jessie | 490-1 | amd64,armel,armhf,i386 |
stretch | 830-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
Debtags of package last-align: |
biology | nuceleic-acids |
field | biology, biology:bioinformatics |
role | program |
|
License: DFSG free
|
LAST is software for comparing and aligning sequences, typically DNA or
protein sequences. LAST is similar to BLAST, but it copes better with very
large amounts of sequence data. Here are two things LAST is good at:
- Comparing large (e.g. mammalian) genomes.
- Mapping lots of sequence tags onto a genome.
The main technical innovation is that LAST finds initial matches based on
their multiplicity, instead of using a fixed size (e.g. BLAST uses 10-mers).
This allows one to map tags to genomes without repeat-masking, without becoming
overwhelmed by repetitive hits. To find these variable-sized matches, it uses
a suffix array (inspired by Vmatch). To achieve high sensitivity, it uses a
discontiguous suffix array, analogous to spaced seeds.
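The idea of choosing initial matches by multiplicity rather than by fixed length can be sketched as follows: starting at a query position, the match is extended until it occurs at most a given number of times in the reference. This sketch counts occurrences by naive string search instead of a suffix array, and the function names and the max_multiplicity parameter are assumptions, not LAST's interface.

```python
def count_occurrences(text, pattern):
    """Count possibly overlapping occurrences of pattern in text."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def adaptive_seed(reference, query, start, max_multiplicity=1):
    """Shortest query substring from start that occurs between 1 and
    max_multiplicity times in reference; None if no such substring exists."""
    for end in range(start + 1, len(query) + 1):
        seed = query[start:end]
        hits = count_occurrences(reference, seed)
        if hits == 0:
            return None
        if hits <= max_multiplicity:
            return seed
    return None


if __name__ == "__main__":
    reference = "ACGTACGA"
    print(adaptive_seed(reference, "ACGA", 0))                      # unique match
    print(adaptive_seed(reference, "ACGA", 0, max_multiplicity=2))  # shorter, rarer-than-3 match
```

Repetitive prefixes such as "A" or "AC" are skipped automatically because they occur too often, which is the effect the description attributes to multiplicity-based matching.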
|
|
libvcflib-tools
C++ library for parsing and manipulating VCF files (tools)
|
Versions of package libvcflib-tools |
Release | Version | Architectures |
stretch | 1.0.0~rc1+dfsg1-3 | amd64 |
sid | 1.0.9+dfsg1-3 | amd64,arm64,mips64el,ppc64el,riscv64 |
buster | 1.0.0~rc2+dfsg-2 | amd64 |
buster-backports | 1.0.1+dfsg-3~bpo10+1 | amd64 |
stretch-backports | 1.0.0~rc1+dfsg1-6~bpo9+1 | amd64 |
bullseye | 1.0.2+dfsg-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bookworm | 1.0.3+dfsg-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
trixie | 1.0.9+dfsg1-3 | amd64,arm64,mips64el,ppc64el,riscv64 |
upstream | 1.0.12 |
|
License: DFSG free
|
The Variant Call Format (VCF) is a flat-file, tab-delimited textual format
intended to concisely describe reference-indexed variations between
individuals. VCF provides a common interchange format for the description of
variation in individuals and populations of samples, and has become the de facto
standard reporting format for a wide array of genomic variant detectors.
vcflib provides methods to manipulate and interpret sequence variation as it
can be described by VCF. It is both:
- an API for parsing and operating on records of genomic variation as it can
be described by the VCF format,
- and a collection of command-line utilities for executing complex
manipulations on VCF files.
This package contains several tools using the library.
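Because VCF is a tab-delimited text format, a single data line can be taken apart with very little code. The sketch below splits the eight fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO) in plain Python; it is not vcflib's API, the function name is hypothetical, and sample columns and header lines are ignored.

```python
def parse_vcf_record(line):
    """Split one tab-delimited VCF data line into its eight fixed fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("a VCF data line needs at least 8 tab-separated fields")
    chrom, pos, ident, ref, alt, qual, filt, info = fields[:8]
    info_map = {}
    if info != ".":
        for entry in info.split(";"):
            key, sep, value = entry.partition("=")
            # Keys without '=' are flags and are stored as True.
            info_map[key] = value if sep else True
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": ident,
        "ref": ref,
        "alt": alt.split(","),
        "qual": None if qual == "." else float(qual),
        "filter": filt,
        "info": info_map,
    }


if __name__ == "__main__":
    record = parse_vcf_record("20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;DB")
    print(record["chrom"], record["pos"], record["alt"], record["info"])
```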
|
|
macs
Model-based Analysis of ChIP-Seq on short reads sequencers
|
Versions of package macs |
Release | Version | Architectures |
buster | 2.1.2.1-1 | amd64,arm64,armhf,i386 |
sid | 3.0.2-1 | amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x |
stretch | 2.1.1.20160309-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
bullseye | 2.2.7.1-3 | amd64,arm64,armel,armhf,i386,ppc64el,s390x |
bookworm | 2.2.7.1-6 | amd64,arm64,armel,armhf,i386,ppc64el,s390x |
jessie | 2.0.9.1-1 | amd64,armel,armhf,i386 |
trixie | 3.0.2-1 | amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x |
|
License: DFSG free
|
MACS empirically models the length of the sequenced ChIP fragments, which
tends to be shorter than sonication or library construction size estimates,
and uses it to improve the spatial resolution of predicted binding sites.
MACS also uses a dynamic Poisson distribution to effectively capture local
biases in the genome sequence, allowing for more sensitive and robust
prediction. MACS compares favorably to existing ChIP-Seq peak-finding
algorithms, is publicly available open source, and can be used for ChIP-Seq
with or without control samples.
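One way to picture a locally adjusted Poisson model is sketched below: the expected tag count for a candidate region is taken as the largest of a genome-wide background rate and rates estimated from surrounding windows, and the enrichment p-value is the Poisson upper tail at the observed count. The choice of windows, the function names and the numbers are illustrative assumptions, not a description of the MACS code.

```python
import math


def local_lambda(background_rate, window_rates):
    """Use the largest expected tag count among the background and the local windows."""
    return max([background_rate, *window_rates])


def poisson_tail(count, lam):
    """P(X >= count) for X ~ Poisson(lam), computed from the lower cumulative sum."""
    below = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(count))
    return 1.0 - below


if __name__ == "__main__":
    lam = local_lambda(background_rate=2.0, window_rates=[1.5, 4.0, 3.0])
    print("lambda:", lam)
    print("p-value for 12 observed tags:", poisson_tail(12, lam))
```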
Please cite:
Yong Zhang, Tao Liu, Clifford A Meyer, Jérôme Eeckhoute, David S. Johnson, Bradley E. Bernstein, Chad Nussbaum, Richard M. Myers, Myles Brown, Wei Li and X Shirley Liu:
Model-based Analysis of ChIP-Seq (MACS).
(PubMed,eprint)
Genome Biol.
9(9):R137
(2008)
|
|
mapdamage
tracking and quantifying damage patterns in ancient DNA sequences
|
Versions of package mapdamage |
Release | Version | Architectures |
bullseye | 2.2.1+dfsg-1 | all |
buster | 2.0.9+dfsg-1 | all |
sid | 2.2.2+dfsg-1 | all |
stretch | 2.0.6+dfsg-2 | all |
trixie | 2.2.2+dfsg-1 | all |
bookworm | 2.2.1+dfsg-3 | all |
|
License: DFSG free
|
MapDamage is a computational framework written in Python and R, which
tracks and quantifies DNA damage patterns among ancient DNA sequencing
reads generated by Next-Generation Sequencing platforms.
MapDamage is developed at the Centre for GeoGenetics by the
Orlando Group.
|
|
mapsembler2
bioinformatics targeted assembly software
|
Versions of package mapsembler2 |
Release | Version | Architectures |
bookworm | 2.2.4+dfsg1-4 | amd64,arm64,ppc64el,s390x |
trixie | 2.2.4+dfsg1-4 | amd64,arm64,ppc64el,s390x |
sid | 2.2.4+dfsg1-4 | amd64,arm64,ppc64el,s390x |
buster | 2.2.4+dfsg-3 | amd64,arm64,armhf,i386 |
stretch | 2.2.3+dfsg-3 | amd64,arm64,armel,armhf,i386,ppc64el,s390x |
jessie | 2.1.6+dfsg-1 | amd64,armel,armhf,i386 |
bullseye | 2.2.4+dfsg1-3 | amd64,arm64,ppc64el,s390x |
|
License: DFSG free
|
Mapsembler2 is targeted assembly software.
It takes as input a set of NGS raw reads (fasta or fastq, gzipped or not)
and a set of input sequences (starters).
It first determines if each starter is read-coherent, i.e. whether reads
confirm the presence of each starter in the original sequence.
Then for each read-coherent starter, Mapsembler2 outputs its sequence
neighborhood as a linear sequence or as a graph, depending on the user choice.
Mapsembler2 may be used to (among other things):
- Validate an assembled sequence (input as starter), e.g. from a de
Bruijn graph assembly where read-coherence was not enforced.
- Check whether a gene (input as starter) has a homolog in a set of reads
- Check whether a known enzyme is present in a metagenomic NGS read set.
- Enrich unmappable reads by extending them, possibly making them mappable
- Check what happens at the extremities of a contig
- Remove contaminants or symbiont reads from a read set
|
|
maq
maps short fixed-length polymorphic DNA sequence reads to reference sequences
|
Versions of package maq |
Release | Version | Architectures |
jessie | 0.7.1-5 | amd64,armel,armhf,i386 |
sid | 0.7.1-10 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
trixie | 0.7.1-10 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 0.7.1-9 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bullseye | 0.7.1-9 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
buster | 0.7.1-8 | amd64,arm64,armhf,i386 |
stretch | 0.7.1-7 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
Debtags of package maq: |
biology | nuceleic-acids |
field | biology, biology:bioinformatics |
interface | commandline |
role | program |
scope | utility |
use | analysing, comparing, searching |
works-with-format | plaintext |
|
License: DFSG free
|
Maq (short for Mapping and Assembly with Quality) builds mapping assemblies
from short reads generated by next-generation sequencing machines. It was
particularly designed for the Illumina-Solexa 1G Genetic Analyzer, and has
preliminary functionality to handle ABI SOLiD data. Maq was previously known as
mapass2.
Development of Maq stopped in 2008. Its successors are BWA and SAMtools.
|
|
maqview
graphical read alignment viewer for short gene sequences
|
Versions of package maqview |
Release | Version | Architectures |
sid | 0.2.5-12 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
trixie | 0.2.5-12 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 0.2.5-11 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bullseye | 0.2.5-10 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
jessie | 0.2.5-6 | amd64,armel,armhf,i386 |
stretch | 0.2.5-7 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
buster | 0.2.5-9 | amd64,arm64,armhf,i386 |
|
License: DFSG free
|
Maqview is a graphical read alignment viewer. It is specifically designed
for the Maq alignment file and allows you to see the mismatches, base
qualities and mapping qualities. Maqview is nothing as fancy as Consed or
GAP, but just a simple viewer for you to see what happens in a
particular region.
In comparison to tgap-maq, the text-based read alignment viewer written
by James Bonfield, Maqview is faster and takes up much less memory and
disk space in indexing. This is possibly because tgap aims to be a
general-purpose viewer but Maqview fully makes use of the fact that a
Maq alignment file has already been sorted. Maqview is also efficient in
viewing and provides a command-line tool to quickly retrieve any region
in a Maq alignment file.
|
|
mhap
locality-sensitive hashing to detect long-read overlaps
|
Versions of package mhap |
Release | Version | Architectures |
stretch-backports | 2.1.3+dfsg-1~bpo9+1 | all |
stretch | 2.1.1+dfsg-1 | all |
sid | 2.1.3+dfsg-3 | all |
trixie | 2.1.3+dfsg-3 | all |
bookworm | 2.1.3+dfsg-3 | all |
bullseye | 2.1.3+dfsg-3 | all |
buster | 2.1.3+dfsg-2 | all |
|
License: DFSG free
|
The MinHash Alignment Process (MHAP, pronounced MAP) is a
reference implementation of a probabilistic sequence overlapping
algorithm designed to efficiently detect all overlaps between noisy
long-read sequence data. It efficiently estimates Jaccard similarity
by compressing sequences to their representative fingerprints
composed of min-mers (minimum k-mers).
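The fingerprinting idea can be sketched in plain Python: for each of several hash functions, keep the minimum hash value over all k-mers of a sequence, then estimate the Jaccard similarity of two sequences as the fraction of fingerprint positions that agree. The hash construction (seeded SHA-1), the function names and the default parameters are assumptions for illustration, not MHAP's implementation.

```python
import hashlib


def kmers(sequence, k):
    """Return the set of all k-mers of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def _hash(kmer, seed):
    """Seeded 64-bit hash of a k-mer, derived from SHA-1 for reproducibility."""
    digest = hashlib.sha1(f"{seed}:{kmer}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_sketch(sequence, k, num_hashes=64):
    """Fingerprint: for each hash function, the minimum hash over all k-mers."""
    kmer_set = kmers(sequence, k)
    if not kmer_set:
        raise ValueError("sequence is shorter than k")
    return [min(_hash(kmer, seed) for kmer in kmer_set) for seed in range(num_hashes)]


def estimate_jaccard(sketch_a, sketch_b):
    """Fraction of fingerprint positions at which the two sketches agree."""
    matches = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)


if __name__ == "__main__":
    a = minhash_sketch("ACGTACGTTGCAAGCTTACG", 4)
    b = minhash_sketch("ACGTACGTTGCAAGCTAACG", 4)
    print(estimate_jaccard(a, b))
```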
|
|
microbiomeutil
Microbiome Analysis Utilities
|
Versions of package microbiomeutil |
Release | Version | Architectures |
bullseye | 20101212+dfsg1-4 | all |
buster | 20101212+dfsg1-2 | all |
stretch | 20101212+dfsg1-1 | all |
jessie | 20101212+dfsg-1 | all |
sid | 20101212+dfsg1-6 | all |
trixie | 20101212+dfsg1-6 | all |
bookworm | 20101212+dfsg1-5 | all |
|
License: DFSG free
|
The microbiomeutil package comes with the following utilities:
- ChimeraSlayer: chimera detection utility.
- NAST-iEr: NAST-based alignment tool.
- WigeoN: A reimplementation of the Pintail 16S anomaly
detection utility
- RESOURCES: Reference 16S sequences and NAST-alignments that
the tools above leverage.
Please cite:
Brian J. Haas, Dirk Gevers, Ashlee M. Earl, Mike Feldgarden, Doyle V. Ward, Georgia Giannoukos, Dawn Ciulla, Diana Tabbaa, Sarah K. Highlander, Erica Sodergren, Barbara Methé, Todd Z. DeSantis, The Human Microbiome Consortium, Joseph F. Petrosino, Rob Knight and Bruce W. Birren:
Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons.
(PubMed,eprint)
Genome Research
21(3):494-504
(2011)
|
|
mira-assembler
Whole Genome Shotgun and EST Sequence Assembler
|
Versions of package mira-assembler |
Release | Version | Architectures |
bookworm | 4.9.6-7 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
stretch | 4.9.6-2 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
buster | 4.9.6-4 | amd64,arm64,armhf,i386 |
bullseye | 4.9.6-5 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
jessie | 4.0.2-1 | amd64,armel,armhf,i386 |
sid | 4.9.6-11 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
trixie | 4.9.6-11 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
Debtags of package mira-assembler: |
role | program |
|
License: DFSG free
|
The mira genome fragment assembler is a specialised assembler for
sequencing projects classified as 'hard' due to a high number of similar
repeats. For expressed sequence tag (EST) transcripts, miraEST
specialises in reconstructing pristine mRNA transcripts while
detecting and classifying single nucleotide polymorphisms (SNPs)
occurring in different variations thereof.
The assembler is routinely used for such various tasks as mutation
detection in different cell types, similarity analysis of transcripts
between organisms, and pristine assembly of sequences from various
sources for oligo design in clinical microarray experiments.
The package provides the following executables:
- mira: for assembly of genome sequences
- miramem: estimates the memory needed to assemble projects.
- mirabait: a "grep"-like tool to select reads with k-mers up to 256 bases.
- miraconvert: a tool to convert, extract and sometimes recalculate all
kinds of data related to sequence assembly files.
|
|
mothur
sequence analysis suite for research on microbiota
|
Versions of package mothur |
Release | Version | Architectures |
stretch | 1.38.1.1-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
jessie | 1.33.3+dfsg-2 | amd64,armel,armhf,i386 |
buster | 1.41.21-1 | amd64,arm64,armhf,i386 |
bullseye | 1.44.3-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
trixie | 1.48.1-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 1.48.0-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
sid | 1.48.1-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
upstream | 1.48.2 |
Debtags of package mothur: |
role | program |
|
License: DFSG free
|
Mothur seeks to develop a single piece of open-source, expandable
software to fill the bioinformatics needs of the microbial ecology
community. It has incorporated the functionality of dotur, sons,
treeclimber, s-libshuff, unifrac, and much more. In addition to improving
the flexibility of these algorithms, a number of other features including
calculators and visualization tools were added.
Please cite:
Patrick D Schloss, Sarah L Westcott, Thomas Ryabin, Justine R Hall, Martin Hartmann, Emily B Hollister, Ryan A Lesniewski, Brian B Oakley, Donovan H Parks, Courtney J Robinson, Jason W Sahl, Blaz Stres, Gerhard G Thallinger, David J Van Horn and Carolyn F Weber:
Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities.
(PubMed)
Appl Environ Microbiol
75(23):7537-7541
(2009)
Topics: Microbial ecology
|
|
nanopolish
consensus caller for nanopore sequencing data
|
Versions of package nanopolish |
Release | Version | Architectures |
stretch | 0.5.0-1 | amd64,arm64,armel,i386,mips64el,mipsel,ppc64el |
buster | 0.11.0-2 | amd64 |
bullseye | 0.13.2-3 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bookworm | 0.14.0-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el |
stretch-backports | 0.10.2-1~bpo9+1 | amd64 |
sid | 0.14.0-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64 |
trixie | 0.14.0-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64 |
|
License: DFSG free
|
Nanopolish uses a signal-level hidden Markov model for consensus calling
of nanopore genome sequencing data. It can perform signal-level analysis
of Oxford Nanopore sequencing data. Nanopolish can calculate an improved
consensus sequence for a draft genome assembly, detect base
modifications, call SNPs and indels with respect to a reference genome
and more.
|
|
paleomix
pipelines and tools for the processing of ancient and modern HTS data
|
Versions of package paleomix |
Release | Version | Architectures |
trixie | 1.3.8-2 | amd64,arm64 |
buster | 1.2.13.3-1 | amd64 |
bullseye | 1.3.2-1 | amd64,arm64,mips64el,ppc64el |
bookworm | 1.3.7-3 | amd64,arm64 |
sid | 1.3.8-2 | amd64,arm64 |
|
License: DFSG free
|
The PALEOMIX pipelines are a set of pipelines and tools designed to aid
the rapid processing of High-Throughput Sequencing (HTS) data: The BAM
pipeline processes de-multiplexed reads from one or more samples,
through sequence processing and alignment, to generate BAM alignment
files useful in downstream analyses; the Phylogenetic pipeline carries
out genotyping and phylogenetic inference on BAM alignment files, either
produced using the BAM pipeline or generated elsewhere; and the Zonkey
pipeline carries out a suite of analyses on low coverage equine
alignments, in order to detect the presence of F1-hybrids in
archaeological assemblages. In addition, PALEOMIX aids in metagenomic
analysis of the extracts.
The pipelines have been designed with ancient DNA (aDNA) in mind, and
include several features especially useful for the analyses of ancient
samples, but can all be used for the processing of modern samples, in order
to ensure consistent data processing.
Please cite:
Mikkel Schubert, Luca Ermini, Clio Der Sarkissian, Hákon Jónsson, Aurélien Ginolhac, Robert Schaefer, Michael D Martin, Ruth Fernández, Martin Kircher, Molly McCue, Eske Willerslev and Ludovic Orlando:
Characterization of ancient and modern genomes by SNP detection and phylogenomic and metagenomic analysis using PALEOMIX.
(PubMed)
Nature Protocols
9(5):1056-82
(2014)
|
|
pbhoney
genomic structural variation discovery
|
Versions of package pbhoney |
Release | Version | Architectures |
bookworm | 15.8.24+dfsg-7 | all |
bullseye | 15.8.24+dfsg-7 | all |
stretch | 15.8.24+dfsg-2 | all |
buster | 15.8.24+dfsg-3 | all |
sid | 15.8.24+dfsg-7 | all |
trixie | 15.8.24+dfsg-7 | all |
|
License: DFSG free
|
PBHoney is an implementation of two variant-identification
approaches designed to exploit the high mappability of long reads
(i.e., greater than 10,000 bp). PBHoney considers both intra-read
discordance and soft-clipped tails of long reads to identify
structural variants.
PBHoney is part of the PBSuite.
|
|
pbjelly
genome assembly upgrading tool
|
Versions of package pbjelly |
Release | Version | Architectures |
stretch | 15.8.24+dfsg-2 | all |
bookworm | 15.8.24+dfsg-7 | all |
sid | 15.8.24+dfsg-7 | all |
bullseye | 15.8.24+dfsg-7 | all |
trixie | 15.8.24+dfsg-7 | all |
buster | 15.8.24+dfsg-3 | all |
|
License: DFSG free
|
PBJelly is a highly automated pipeline that aligns long sequencing
reads (such as PacBio RS reads or long 454 reads in fasta format)
to high-confidence draft assemblies. PBJelly fills or reduces as
many captured gaps as possible to produce upgraded draft genomes.
PBJelly is part of the PBSuite.
|
|
pbsuite
software for Pacific Biosciences sequencing data
|
Versions of package pbsuite |
Release | Version | Architectures |
trixie | 15.8.24+dfsg-7 | all |
bookworm | 15.8.24+dfsg-7 | all |
stretch | 15.8.24+dfsg-2 | all |
sid | 15.8.24+dfsg-7 | all |
bullseye | 15.8.24+dfsg-7 | all |
buster | 15.8.24+dfsg-3 | all |
|
License: DFSG free
|
The PBSuite contains two projects created for analysis of
Pacific Biosciences long-read sequencing data.
- PBJelly - genome upgrading tool
- PBHoney - structural variation discovery
|
|
picard-tools
Command line tools to manipulate SAM and BAM files
|
Versions of package picard-tools |
Release | Version | Architectures |
sid | 3.1.1+dfsg-1 | all |
bullseye | 2.24.1+dfsg-1 | all |
buster | 2.18.25+dfsg-2 | amd64 |
jessie | 1.113-1 | all |
trixie | 3.1.1+dfsg-1 | all |
stretch | 2.8.1+dfsg-1 | all |
bookworm | 2.27.5+dfsg-2 | all |
upstream | 3.3.0 |
|
License: DFSG free
|
SAM (Sequence Alignment/Map) format is a generic format for storing
large nucleotide sequence alignments. Picard Tools includes these
utilities to manipulate SAM and BAM files:
AddCommentsToBam FifoBuffer
AddOrReplaceReadGroups FilterSamReads
BaitDesigner FilterVcf
BamIndexStats FixMateInformation
GatherBamFiles
BedToIntervalList GatherVcfs
BuildBamIndex GenotypeConcordance
CalculateHsMetrics IlluminaBasecallsToFastq
CalculateReadGroupChecksum IlluminaBasecallsToSam
CheckIlluminaDirectory LiftOverIntervalList
CheckTerminatorBlock LiftoverVcf
CleanSam MakeSitesOnlyVcf
CollectAlignmentSummaryMetrics MarkDuplicates
CollectBaseDistributionByCycle MarkDuplicatesWithMateCigar
CollectGcBiasMetrics MarkIlluminaAdapters
CollectHiSeqXPfFailMetrics MeanQualityByCycle
CollectIlluminaBasecallingMetrics MergeBamAlignment
CollectIlluminaLaneMetrics MergeSamFiles
CollectInsertSizeMetrics MergeVcfs
CollectJumpingLibraryMetrics NormalizeFasta
CollectMultipleMetrics PositionBasedDownsampleSam
CollectOxoGMetrics QualityScoreDistribution
CollectQualityYieldMetrics RenameSampleInVcf
CollectRawWgsMetrics ReorderSam
CollectRnaSeqMetrics ReplaceSamHeader
CollectRrbsMetrics RevertOriginalBaseQualitiesAndAddMateCigar
CollectSequencingArtifactMetrics RevertSam
CollectTargetedPcrMetrics SamFormatConverter
CollectVariantCallingMetrics SamToFastq
CollectWgsMetrics ScatterIntervalsByNs
CompareMetrics SortSam
CompareSAMs SortVcf
ConvertSequencingArtifactToOxoG SplitSamByLibrary
CreateSequenceDictionary SplitVcfs
DownsampleSam UpdateVcfSequenceDictionary
EstimateLibraryComplexity ValidateSamFile
ExtractIlluminaBarcodes VcfFormatConverter
ExtractSequences VcfToIntervalList
FastqToSam ViewSam
The package is enhanced by the following packages:
multiqc
Please cite:
Broad Institute:
Picard toolkit.
Broad Institute, GitHub repository
(2019)
Topics: Sequencing; Document, record and content management
|
|
pirs
Profile based Illumina pair-end Reads Simulator
|
Versions of package pirs |
Release | Version | Architectures |
bullseye | 2.0.2+dfsg-9 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
sid | 2.0.2+dfsg-12 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 2.0.2+dfsg-11 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
buster | 2.0.2+dfsg-8 | amd64,arm64,armhf,i386 |
stretch | 2.0.2+dfsg-5.1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
trixie | 2.0.2+dfsg-12 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
|
License: DFSG free
|
The program pIRS can be used for simulating Illumina PE reads, with a
series of characteristics produced by the Illumina sequencing platform, such
as insert size distribution, sequencing errors (substitution, insertion,
deletion), quality scores and GC content-coverage bias.
The insert size follows a normal distribution, so users should set the
mean value and standard deviation. Usually the standard deviation is set
to 1/20 of the mean value. The normal distribution is simulated with the
Box-Muller method.
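As a rough illustration of the insert size model described above, the following Python sketch draws normally distributed insert sizes with the Box-Muller transform. It is not pIRS code: the function names and the seed handling are hypothetical, and only the 1/20 default ratio comes from the description.

```python
import math
import random


def box_muller(rng):
    """Return one standard-normal sample built from two uniform samples."""
    u1 = 1.0 - rng.random()  # lies in (0, 1], so log(u1) is always defined
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def sample_insert_sizes(mean, n, sd=None, seed=0):
    """Draw n insert sizes; sd defaults to mean / 20 as suggested in the text."""
    if sd is None:
        sd = mean / 20.0
    rng = random.Random(seed)
    return [mean + sd * box_muller(rng) for _ in range(n)]
```

For a library with a mean insert size of 500 bp, `sample_insert_sizes(500, 1000)` would use a standard deviation of 25 bp.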
The program simulates sequencing error, quality score and GC content-
coverage bias according to the empirical distribution profile. Some
default profiles counted from lots of real sequencing data are provided.
To simulate reads from a diploid genome, users should first simulate the
diploid genome sequence by setting the ratio of heterozygous SNPs,
heterozygous InDels and structural variation.
Please cite:
Xuesong Hu, Jianying Yuan, Yujian Shi, Jianliang Lu, Binghang Liu, Zhenyu Li, Yanxiang Chen, Desheng Mu, Hao Zhang, Nan Li, Zhen Yue, Fan Bai, Heng Li and Wei Fan:
pIRS: Profile-based Illumina pair-end reads simulator.
(PubMed,eprint)
Bioinformatics
28(11):1533-5
(2012)
|
|
pizzly
Identifies gene fusions in RNA sequencing data
|
Versions of package pizzly |
Release | Version | Architectures |
sid | 0.37.3+ds-9 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
bullseye | 0.37.3+ds-5 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
trixie | 0.37.3+ds-9 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
bookworm | 0.37.3+ds-9 | amd64,arm64,mips64el,ppc64el,s390x |
|
License: DFSG free
|
For the interpretation of the transcriptome (the abundance
and sequence of RNA) of tumour cells one is particularly
interested in transcripts that cannot be mapped to single
genes but that are seen to be fused as parts from two genes.
Likely explanations are chromosomal translocations.
Pizzly can identify novel fusions of this kind, building on
interpretations of variable splicing by the tool kallisto.
Both tools are elements of the bcbio workflow.
|
|
placnet
Plasmid Constellation Network project
|
Versions of package placnet |
Release | Version | Architectures |
bookworm | 1.04-1 | all |
buster | 1.03-3 | all |
stretch | 1.03-2 | all |
bullseye | 1.03-3 | all |
trixie | 1.04-1 | all |
sid | 1.04-1 | all |
|
License: DFSG free
|
Placnet is a new tool for plasmid analysis in NGS projects. Placnet is
optimized to work with Illumina sequences but it also works with 454,
Ion Torrent or any other current sequencing technology.
The input of placnet is a set of contigs and one or more SAM files with
the mapping of the reads against the contigs. Placnet produces a set of
files that can easily be opened in Cytoscape or other network tools.
|
|
poretools
toolkit for nanopore nucleotide sequencing data
|
Versions of package poretools |
Release | Version | Architectures |
buster | 0.6.0+dfsg-3 | all |
bookworm | 0.6.0+dfsg-6 | all |
bullseye | 0.6.0+dfsg-5 | all |
trixie | 0.6.0+dfsg-7 | all |
sid | 0.6.0+dfsg-7 | all |
stretch | 0.6.0+dfsg-2 | all |
|
License: DFSG free
|
poretools is a flexible toolkit for exploring datasets generated by nanopore
sequencing devices from MinION for the purposes of quality control and
downstream analysis. Poretools operates directly on the native FAST5 (a
variant of the HDF5 standard) file format produced by ONT and provides a
wealth of format conversion utilities and data exploration and visualization
tools.
|
|
python3-airr
Data Representation Standard library for antibody and TCR sequences
|
Versions of package python3-airr |
Release | Version | Architectures |
sid | 1.5.0-1 | all |
bullseye | 1.3.1-1 | all |
bookworm | 1.3.1-1 | all |
buster | 1.2.1-2 | all |
trixie | 1.5.0-1 | all |
upstream | 1.5.1 |
|
License: DFSG free
|
This package provides a library by the AIRR community for describing,
reporting, storing, and sharing adaptive immune receptor repertoire
(AIRR) data, such as sequences of antibodies and T cell receptors
(TCRs). Some specific efforts include:
- The MiAIRR standard for describing minimal information about AIRR
datasets, including sample collection and data processing information.
- Data representations (file format) specifications for storing large
amounts of annotated AIRR data.
- APIs for exposing a common interface to repositories/databases
containing AIRR data.
- A community standard for software tools which will allow conforming
tools to gain community recognition.
This package installs the library for Python 3.
|
|
python3-gffutils
Work with GFF and GTF files in a flexible database framework
|
Versions of package python3-gffutils |
Release | Version | Architectures |
sid | 0.13-1 | all |
bookworm | 0.11.1-3 | all |
buster | 0.9-1 | all |
trixie | 0.13-1 | all |
bullseye | 0.10.1-2 | all |
|
License: DFSG free
|
A Python package for working with and manipulating the GFF and GTF format
files typically used for genomic annotations. Files are loaded into a
sqlite3 database, allowing much more complex manipulation of hierarchical
features (e.g., genes, transcripts, and exons) than is possible with
plain-text methods alone.
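To illustrate why a database helps with hierarchical features, here is a minimal sketch that uses only Python's standard sqlite3 module. It does not use the gffutils API or its real schema; the table layout, the column names and the tiny GFF3 sample are assumptions made for the example.

```python
import sqlite3

# Tiny hypothetical GFF3 sample: one gene, one transcript, two exons.
GFF = """\
chr1\tdemo\tgene\t100\t900\t.\t+\t.\tID=gene1
chr1\tdemo\tmRNA\t100\t900\t.\t+\t.\tID=tx1;Parent=gene1
chr1\tdemo\texon\t100\t300\t.\t+\t.\tID=ex1;Parent=tx1
chr1\tdemo\texon\t600\t900\t.\t+\t.\tID=ex2;Parent=tx1
"""


def load(gff_text):
    """Load GFF3 lines into an in-memory SQLite table (illustrative schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE features (id TEXT PRIMARY KEY, parent TEXT, "
               "type TEXT, start_pos INTEGER, end_pos INTEGER)")
    for line in gff_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.split("\t")
        attrs = dict(item.split("=", 1) for item in cols[8].split(";"))
        db.execute("INSERT INTO features VALUES (?, ?, ?, ?, ?)",
                   (attrs["ID"], attrs.get("Parent"), cols[2],
                    int(cols[3]), int(cols[4])))
    return db


def exons_of_gene(db, gene_id):
    """Exons whose transcript belongs to gene_id, found with a self-join."""
    rows = db.execute(
        "SELECT e.id, e.start_pos, e.end_pos FROM features AS e "
        "JOIN features AS t ON e.parent = t.id "
        "WHERE t.parent = ? AND e.type = 'exon' ORDER BY e.start_pos",
        (gene_id,))
    return rows.fetchall()
```

The gene-to-exon lookup is a single join here, whereas a plain-text approach would need to scan the file and track parent identifiers by hand.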
|
|
python3-presto
toolkit for processing B and T cell sequences (Python3 module)
|
Versions of package python3-presto |
Release | Version | Architectures |
trixie | 0.7.2-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
sid | 0.7.2-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
buster | 0.5.10-1 | all |
bullseye | 0.6.2-1 | all |
bookworm | 0.7.1-1 | all |
|
License: DFSG free
|
pRESTO is a toolkit for processing raw reads from high-throughput
sequencing of B cell and T cell repertoires.
Dramatic improvements in high-throughput sequencing technologies now
enable large-scale characterization of lymphocyte repertoires, defined
as the collection of trans-membrane antigen-receptor proteins located on
the surface of B cells and T cells. The REpertoire Sequencing TOolkit
(pRESTO) is composed of a suite of utilities to handle all stages
of sequence processing prior to germline segment assignment. pRESTO
is designed to handle either single reads or paired-end reads. It
includes features for quality control, primer masking, annotation of
reads with sequence embedded barcodes, generation of unique molecular
identifier (UMI) consensus sequences, assembly of paired-end reads and
identification of duplicate sequences. Numerous options for sequence
sorting, sampling and conversion operations are also included.
This package provides the presto Python3 module.
|
|
python3-pybedtools
Python 3 wrapper around BEDTools for bioinformatics work
|
Versions of package python3-pybedtools |
Release | Version | Architectures |
sid | 0.10.0-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64 |
buster | 0.8.0-1 | amd64,arm64 |
trixie | 0.10.0-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64 |
bookworm | 0.9.0-4 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el |
bullseye | 0.8.0-5 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el |
|
License: DFSG free
|
The BEDTools suite of programs is widely used for genomic interval
manipulation or “genome algebra”. pybedtools wraps and extends BEDTools and
offers feature-level manipulations from within Python.
This is the Python 3 version.
|
|
python3-sqt
SeQuencing Tools for biological DNA/RNA high-throughput data
|
Versions of package python3-sqt |
Release | Version | Architectures |
sid | 0.8.0-8 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64 |
trixie | 0.8.0-8 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64 |
bookworm | 0.8.0-6 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el |
bullseye | 0.8.0-4 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el |
buster | 0.8.0-3 | amd64,arm64 |
|
License: DFSG free
|
sqt is a collection of command-line tools for working with
high-throughput sequencing data. Conceptually not fixed to use any
particular language, many sqt subcommands are currently implemented
in Python. For them, a Python package is available with functions for
reading and writing FASTA/FASTQ files, computing alignments, quality
trimming, etc.
The following tools are offered:
- sqt-coverage -- Compute per-reference statistics such as coverage
and GC content
- sqt-fastqmod -- FASTQ modifications: shorten, subset, reverse
complement, quality trimming.
- sqt-fastastats -- Compute N50, min/max length, GC content etc. of
a FASTA file
- sqt-qualityguess -- Guess quality encoding of one or more FASTQ files.
- sqt-globalalign -- Compute a global or semiglobal alignment of two strings.
- sqt-chars -- Count length of the first word given on the command line.
- sqt-sam-cscq -- Add the CS and CQ tags to a SAM file with colorspace reads.
- sqt-fastamutate -- Add substitutions and indels to sequences in a
FASTA file.
- sqt-fastaextract -- Efficiently extract one or more regions from an
indexed FASTA file.
- sqt-translate -- Replace characters in FASTA files (like the 'tr'
command).
- sqt-sam-fixn -- Replace all non-ACGT characters within reads in a
SAM file.
- sqt-sam-insertsize -- Mean and standard deviation of paired-end
insert sizes.
- sqt-sam-set-op -- Set operations (union, intersection, ...) on
SAM/BAM files.
- sqt-bam-eof -- Check for the End-Of-File marker in compressed
BAM files.
- sqt-checkfastqpe -- Check whether two FASTQ files contain correctly
paired paired-end data.
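Two of the statistics named in the list above, N50 and GC content, are simple enough to sketch directly. The following Python functions are an independent illustration and not the sqt implementation; the function names are hypothetical.

```python
def n50(lengths):
    """Largest length L such that sequences of length >= L cover half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no sequences given")


def gc_content(seq):
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)
```

For contig lengths 2, 3, 4, 5 and 6 the total is 20, and the two longest contigs (6 and 5) already cover more than half of it, so the N50 is 5.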
|
|
q2cli
Click-based command line interface for QIIME 2
|
Versions of package q2cli |
Release | Version | Architectures |
sid | 2024.5.0-2 | all |
bullseye | 2020.11.1-1 | all |
bookworm | 2022.11.1-2 | all |
upstream | 2024.10.1 |
|
License: DFSG free
|
QIIME 2 is a powerful, extensible, and decentralized microbiome analysis
package with a focus on data and analysis transparency. QIIME 2 enables
researchers to start an analysis with raw DNA sequence data and finish with
publication-quality figures and statistical results.
Key features:
- Integrated and automatic tracking of data provenance
- Semantic type system
- Plugin system for extending microbiome analysis functionality
- Support for multiple types of user interfaces (e.g. API, command line,
graphical)
QIIME 2 is a complete redesign and rewrite of the QIIME 1 microbiome analysis
pipeline. QIIME 2 will address many of the limitations of QIIME 1, while
retaining the features that make QIIME 1 a powerful and widely-used analysis
pipeline.
QIIME 2 currently supports an initial end-to-end microbiome analysis pipeline.
New functionality will regularly become available through QIIME 2 plugins. You
can view a list of plugins that are currently available on the QIIME 2 plugin
availability page. The future plugins page lists plugins that are being
developed.
Please cite:
Evan Bolyen, Jai Ram Rideout, Matthew R Dillon, Nicholas A Bokulich, Christian Abnet, Gabriel A Al-Ghalith, Harriet Alexander, Eric J Alm, Manimozhiyan Arumugam, Francesco Asnicar, Yang Bai, Jordan E Bisanz, Kyle Bittinger, Asker Brejnrod, Colin J Brislawn, C Titus Brown, Benjamin J Callahan, Andrés Mauricio Caraballo-Rodríguez, John Chase, Emily Cope, Ricardo Da Silva, Pieter C Dorrestein, Gavin M Douglas, Daniel M Durall, Claire Duvallet, Christian F Edwardson, Madeleine Ernst, Mehrbod Estaki, Jennifer Fouquier, Julia M Gauglitz, Deanna L Gibson, Antonio Gonzalez, Kestrel Gorlick, Jiarong Guo, Benjamin Hillmann, Susan Holmes, Hannes Holste, Curtis Huttenhower, Gavin Huttley, Stefan Janssen, Alan K Jarmusch, Lingjing Jiang, Benjamin Kaehler, Kyo Bin Kang, Christopher R Keefe, Paul Keim, Scott T Kelley, Dan Knights, Irina Koester, Tomasz Kosciolek, Jorden Kreps, Morgan GI Langille, Joslynn Lee, Ruth Ley, Yong-Xin Liu, Erikka Loftfield, Catherine Lozupone, Massoud Maher, Clarisse Marotz, Bryan D Martin, Daniel McDonald, Lauren J McIver, Alexey V Melnik, Jessica L Metcalf, Sydney C Morgan, Jamie Morton, Ahmad Turan Naimey, Jose A Navas-Molina, Louis Felix Nothias, Stephanie B Orchanian, Talima Pearson, Samuel L Peoples, Daniel Petras, Mary Lai Preuss, Elmar Pruesse, Lasse Buur Rasmussen, Adam Rivers, Michael S Robeson, Patrick Rosenthal, Nicola Segata, Michael Shaffer, Arron Shiffer, Rashmi Sinha, Se Jin Song, John R Spear, Austin D Swafford, Luke R Thompson, Pedro J Torres, Pauline Trinh, Anupriya Tripathi, Peter J Turnbaugh, Sabah Ul-Hasan, Justin JJ van der Hooft, Fernando Vargas, Yoshiki Vázquez-Baeza, Emily Vogtmann, Max von Hippel, William Walters, Yunhu Wan, Mingxun Wang, Jonathan Warren, Kyle C Weber, Chase HD Williamson, Amy D Willis, Zhenjiang Zech Xu, Jesse R Zaneveld, Yilong Zhang, Qiyun Zhu, Rob Knight and J Gregory Caporaso:
Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2.
(eprint)
Nature Biotechnology
37
(2019)
|
|
qcumber
quality control of genomic sequences
|
Versions of package qcumber |
Release | Version | Architectures |
trixie | 2.3.0-2 | all |
sid | 2.3.0-2 | all |
bullseye | 2.3.0-2 | all |
buster | 1.0.14+dfsg-1 | all |
bookworm | 2.3.0-2 | all |
|
License: DFSG free
|
QCPipeline is a tool for quality control. The workflow is as follows:
1. Quality control with FastQC
2. Trim Reads with Trimmomatic
3. Quality control of trimmed reads with FastQC
4. Map reads against reference using bowtie2
5. Classify reads with Kraken
|
|
qiime
Quantitative Insights Into Microbial Ecology
|
Versions of package qiime |
Release | Version | Architectures |
sid | 2024.5.0-1 | all |
bookworm | 2022.11.1-2 | all |
bullseye | 2020.11.1-1 | all |
jessie | 1.8.0+dfsg-4 | amd64,armel,armhf,i386 |
upstream | 2024.10.1 |
Debtags of package qiime: |
role | program |
|
License: DFSG free
|
Microbes surround us, as well as animals, plants and all their parasites, and
have a strong effect on them and on the environments they live in. Soil quality
comes to mind, but also the effect that bacteria have on each other. Humans
influence the absolute and relative abundance of bacteria through antibiotics,
food and fertilizers, to name a few, and these changes affect us.
QIIME 2 is a powerful, extensible, and decentralized microbiome analysis
package with a focus on data and analysis transparency. QIIME 2 enables
researchers to start an analysis with raw DNA sequence data and finish with
publication-quality figures and statistical results.
Key features:
- Integrated and automatic tracking of data provenance
- Semantic type system
- Plugin system for extending microbiome analysis functionality
- Support for multiple types of user interfaces (e.g. API, command line,
graphical)
QIIME 2 is a complete redesign and rewrite of the QIIME 1 microbiome analysis
pipeline. QIIME 2 will address many of the limitations of QIIME 1, while
retaining the features that make QIIME 1 a powerful and widely-used analysis
pipeline.
QIIME 2 currently supports an initial end-to-end microbiome analysis pipeline.
New functionality will regularly become available through QIIME 2 plugins. You
can view a list of plugins that are currently available on the QIIME 2 plugin
availability page. The future plugins page lists plugins that are being
developed.
Please cite:
Evan Bolyen, Jai Ram Rideout, Matthew R Dillon, Nicholas A Bokulich, Christian Abnet, Gabriel A Al-Ghalith, Harriet Alexander, Eric J Alm, Manimozhiyan Arumugam, Francesco Asnicar, Yang Bai, Jordan E Bisanz, Kyle Bittinger, Asker Brejnrod, Colin J Brislawn, C Titus Brown, Benjamin J Callahan, Andrés Mauricio Caraballo-Rodríguez, John Chase, Emily Cope, Ricardo Da Silva, Pieter C Dorrestein, Gavin M Douglas, Daniel M Durall, Claire Duvallet, Christian F Edwardson, Madeleine Ernst, Mehrbod Estaki, Jennifer Fouquier, Julia M Gauglitz, Deanna L Gibson, Antonio Gonzalez, Kestrel Gorlick, Jiarong Guo, Benjamin Hillmann, Susan Holmes, Hannes Holste, Curtis Huttenhower, Gavin Huttley, Stefan Janssen, Alan K Jarmusch, Lingjing Jiang, Benjamin Kaehler, Kyo Bin Kang, Christopher R Keefe, Paul Keim, Scott T Kelley, Dan Knights, Irina Koester, Tomasz Kosciolek, Jorden Kreps, Morgan GI Langille, Joslynn Lee, Ruth Ley, Yong-Xin Liu, Erikka Loftfield, Catherine Lozupone, Massoud Maher, Clarisse Marotz, Bryan D Martin, Daniel McDonald, Lauren J McIver, Alexey V Melnik, Jessica L Metcalf, Sydney C Morgan, Jamie Morton, Ahmad Turan Naimey, Jose A Navas-Molina, Louis Felix Nothias, Stephanie B Orchanian, Talima Pearson, Samuel L Peoples, Daniel Petras, Mary Lai Preuss, Elmar Pruesse, Lasse Buur Rasmussen, Adam Rivers, Michael S Robeson, Patrick Rosenthal, Nicola Segata, Michael Shaffer, Arron Shiffer, Rashmi Sinha, Se Jin Song, John R Spear, Austin D Swafford, Luke R Thompson, Pedro J Torres, Pauline Trinh, Anupriya Tripathi, Peter J Turnbaugh, Sabah Ul-Hasan, Justin JJ van der Hooft, Fernando Vargas, Yoshiki Vázquez-Baeza, Emily Vogtmann, Max von Hippel, William Walters, Yunhu Wan, Mingxun Wang, Jonathan Warren, Kyle C Weber, Chase HD Williamson, Amy D Willis, Zhenjiang Zech Xu, Jesse R Zaneveld, Yilong Zhang, Qiyun Zhu, Rob Knight and J Gregory Caporaso:
Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2.
(PubMed,eprint)
Nature Biotechnology
37:852 - 857
(2019)
Topics: Microbial ecology
|
|
quorum
QUality Optimized Reads of genomic sequences
|
Versions of package quorum |
Release | Version | Architectures |
bookworm | 1.1.1-7 | amd64,arm64,mips64el,ppc64el |
sid | 1.1.2-2 | amd64,arm64,mips64el,ppc64el,riscv64 |
buster | 1.1.1-2 | amd64,arm64 |
trixie | 1.1.2-2 | amd64,arm64,mips64el,ppc64el,riscv64 |
bullseye | 1.1.1-4 | amd64,arm64,mips64el,ppc64el |
|
License: DFSG free
|
QuorUM produces trimmed and error-corrected reads that result
in assemblies with longer contigs and fewer errors. QuorUM provides best
performance compared to other published error correctors in several
metrics. QuorUM is efficiently implemented making use of current multi-
core computing architectures and it is suitable for large data sets (1
billion bases checked and corrected per day per core). The third-party
assembler (SOAPdenovo) benefits significantly from using QuorUM error-
corrected reads. QuorUM error corrected reads result in a factor of 1.1
to 4 improvement in N50 contig size compared to using the original reads
with SOAPdenovo for the data sets investigated.
|
|
r-bioc-deseq2
R package for RNA-Seq Differential Expression Analysis
|
Versions of package r-bioc-deseq2 |
Release | Version | Architectures |
sid | 1.44.0+dfsg-1 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
bullseye | 1.30.1+dfsg-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bookworm | 1.38.3+dfsg-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
buster | 1.22.2+dfsg-1 | amd64,arm64,armhf,i386 |
trixie | 1.44.0+dfsg-1 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
stretch | 1.14.1-1 | amd64,arm64,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
upstream | 1.46.0 |
|
License: DFSG free
|
Differential gene expression analysis based on the negative binomial
distribution. Estimate variance-mean dependence in count data from
high-throughput sequencing assays and test for differential expression based
on a model using the negative binomial distribution.
|
|
r-bioc-edger
Empirical analysis of digital gene expression data in R
|
Versions of package r-bioc-edger |
Release | Version | Architectures |
experimental | 4.4.0+dfsg-1 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
jessie | 3.8.2+dfsg-1 | amd64,armel,armhf,i386 |
sid | 4.2.2+dfsg-1 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
trixie | 4.2.2+dfsg-1 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
bookworm | 3.40.2+dfsg-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bullseye | 3.32.1+dfsg-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
stretch | 3.14.0+dfsg-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
upstream | 4.4.0 |
|
License: DFSG free
|
Bioconductor package for differential expression analysis of whole
transcriptome sequencing (RNA-seq) and digital gene expression
profiles with biological replication. It uses empirical Bayes
estimation and exact tests based on the negative binomial
distribution. It is also useful for differential signal analysis with
other types of genome-scale count data.
|
|
r-bioc-hilbertvis
GNU R package to visualise long vector data
|
Versions of package r-bioc-hilbertvis |
Release | Version | Architectures |
jessie | 1.24.0-1 | amd64,armel,armhf,i386 |
stretch | 1.32.0-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
buster | 1.40.0-1 | amd64,arm64,armhf,i386 |
bullseye | 1.48.0-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bookworm | 1.56.0-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
trixie | 1.62.0-1 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
sid | 1.62.0-1 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
experimental | 1.64.0-1 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
upstream | 1.64.0 |
Debtags of package r-bioc-hilbertvis: |
biology | nuceleic-acids |
field | biology, biology:bioinformatics |
use | analysing |
|
License: DFSG free
|
This tool allows one to display very long data vectors in a space-efficient
manner, by organising them along a 2D Hilbert curve. The user can then
visually judge the large scale structure and distribution of features
simultaneously with the rough shape and intensity of individual features.
In bioinformatics, a typical use case is ChIP-Chip and ChIP-Seq,
or basically any kind of genomic data that is conventionally
displayed as a quantitative track ("wiggle data") in genome browsers such
as those provided by Ensembl or UCSC.
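The layout idea can be sketched with the standard index-to-coordinate mapping for a Hilbert curve. This Python example is not HilbertVis code: the function names are hypothetical, and it assumes the vector has at most 4^order entries, so it places values directly and does no binning.

```python
def d2xy(order, d):
    """Map position d along a Hilbert curve of the given order to (x, y)."""
    n = 1 << order  # the grid is n x n cells
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:  # flip the sub-square
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x  # rotate by swapping the axes
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_image(values, order):
    """Place a 1D vector on a 2D grid so that neighbours in 1D stay close in 2D."""
    n = 1 << order
    grid = [[0] * n for _ in range(n)]
    for d, value in enumerate(values[: n * n]):
        x, y = d2xy(order, d)
        grid[y][x] = value
    return grid
```

Because consecutive positions always land in adjacent cells, a peak in the vector appears as a compact blob rather than a thin line, which is what makes large-scale structure visible.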
|
|
r-bioc-metagenomeseq
GNU R statistical analysis for sparse high-throughput sequencing
|
Versions of package r-bioc-metagenomeseq |
Release | Version | Architectures |
bookworm | 1.40.0-1 | all |
stretch | 1.16.0-2 | all |
experimental | 1.46.0-2 | all |
trixie | 1.46.0-1 | all |
sid | 1.46.0-1 | all |
buster | 1.24.1-1 | all |
bullseye | 1.32.0-1 | all |
|
License: DFSG free
|
MetagenomeSeq is designed to determine features (be they Operational
Taxonomic Units (OTUs), species, etc.) that are differentially abundant
between two or more groups of multiple samples. metagenomeSeq is
designed to address the effects of both normalization and under-sampling
of microbial communities on disease association detection and the
testing of feature correlations.
|
|
r-bioc-rsubread
Subread Sequence Alignment and Counting for R
|
Versions of package r-bioc-rsubread |
Release | Version | Architectures |
bullseye | 2.4.2-1 | amd64,arm64,mips64el,ppc64el,s390x |
trixie | 2.18.0-1 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
sid | 2.18.0-1 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
bookworm | 2.12.2-1 | amd64,arm64,mips64el,ppc64el,s390x |
upstream | 2.20.0 |
|
License: DFSG free
|
Alignment, quantification and analysis of second and third generation
sequencing data. Includes functionality for read mapping, read counting,
SNP calling, structural variant detection and gene fusion discovery.
Can be applied to all major sequencing technologies and to both short
and long sequence reads.
|
|
r-cran-alakazam
Immunoglobulin Clonal Lineage and Diversity Analysis
|
Versions of package r-cran-alakazam |
Release | Version | Architectures |
bookworm | 1.2.1-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
sid | 1.3.0-1 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
experimental | 1.3.0-2~0exp0 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
buster | 0.2.11-1 | amd64,arm64,armhf,i386 |
trixie | 1.3.0-1 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
bullseye | 1.1.0-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
|
License: DFSG free
|
Alakazam is part of the Immcantation analysis framework for Adaptive
Immune Receptor Repertoire sequencing (AIRR-seq) and provides a set of
tools to investigate lymphocyte receptor clonal lineages, diversity,
gene usage, and other repertoire level properties, with a focus on
high-throughput immunoglobulin (Ig) sequencing.
Alakazam serves five main purposes:
- Providing core functionality for other R packages in the Immcantation
framework. This includes common tasks such as file I/O, basic DNA
sequence manipulation, and interacting with V(D)J segment and gene
annotations.
- Providing an R interface for interacting with the output of the
pRESTO and Change-O tool suites.
- Performing lineage reconstruction on clonal populations of Ig
sequences and analyzing the topology of the resultant lineage trees.
- Performing clonal abundance and diversity analysis on lymphocyte
repertoires.
- Performing physicochemical property analyses of lymphocyte receptor
sequences.
|
|
r-cran-shazam
Immunoglobulin Somatic Hypermutation Analysis
|
Versions of package r-cran-shazam |
Release | Version | Architectures |
sid | 1.2.0-1 | all |
buster | 0.1.11-1 | all |
bullseye | 1.0.2-1 | all |
bookworm | 1.1.2-1 | all |
trixie | 1.2.0-1 | all |
|
License: DFSG free
|
Provides a computational framework for Bayesian estimation of
antigen-driven selection in immunoglobulin (Ig) sequences, providing an
intuitive means of analyzing selection by quantifying the degree of
selective pressure. Also provides tools to profile mutations in Ig
sequences, build models of somatic hypermutation (SHM) in Ig sequences,
and make model-dependent distance comparisons of Ig repertoires.
SHazaM is part of the Immcantation analysis framework for Adaptive
Immune Receptor Repertoire sequencing (AIRR-seq) and provides tools for
advanced analysis of somatic hypermutation (SHM) in immunoglobulin (Ig)
sequences. SHazaM focuses on the following analysis topics:
- Quantification of mutational load
SHazaM includes methods for determining the rate of observed and
expected mutations under various criteria. Mutational profiling
criteria include rates under SHM targeting models, mutations specific
to CDR and FWR regions, and physicochemical property dependent
substitution rates.
- Statistical models of SHM targeting patterns
Models of SHM may be divided into two independent components:
1) a mutability model that defines where mutations occur and
2) a nucleotide substitution model that defines the resulting mutation.
Collectively these two components define an SHM targeting
model. SHazaM provides empirically derived SHM 5-mer context mutation
models for both humans and mice, as well as tools to build SHM targeting
models from data.
- Analysis of selection pressure using BASELINe
The Bayesian Estimation of Antigen-driven Selection in Ig Sequences
(BASELINe) method is a novel method for quantifying antigen-driven
selection in high-throughput Ig sequence data. BASELINe uses SHM
targeting models to estimate the null distribution of
expected mutation frequencies, and provides measures of selection
pressure informed by known AID targeting biases.
- Model-dependent distance calculations
SHazaM provides methods to compute evolutionary distances between
sequences or sets of sequences based on SHM targeting models. This
information is particularly useful in understanding and defining
clonal relationships.
|
|
r-cran-tcr
Advanced Data Analysis of Immune Receptor Repertoires
|
Versions of package r-cran-tcr |
Release | Version | Architectures |
bullseye | 2.3.2+ds-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
sid | 2.3.2+ds-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
trixie | 2.3.2+ds-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 2.3.2+ds-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
buster | 2.2.3-1 | amd64,arm64,armhf,i386 |
|
License: DFSG free
|
Cells of the immune system are the grand exception to the rule
that all cells of an individual have (mostly exact) copies of the
same DNA. B cells (which produce antibodies) and T cells (which
communicate with cells) however have a section of their DNA with
genes of the groups V, D and J that are reorganised within the
genomic DNA to provide the flexibility to deal with yet unknown
pathogens.
This package provides a platform for the advanced analysis of T
cell receptor repertoire data and its visualisations.
Caveat: This package is soon to be replaced by
http://github.com/immunomind/immunarch which is not yet available
as a Debian package.
|
|
r-cran-tigger
Infers new Immunoglobulin alleles from Rep-Seq Data
|
Versions of package r-cran-tigger |
Release | Version | Architectures |
trixie | 1.1.0-1 | all |
buster | 0.3.1-1 | all |
sid | 1.1.0-1 | all |
bookworm | 1.0.1-1 | all |
bullseye | 1.0.0-1 | all |
|
License: DFSG free
|
Summary: Infers the V genotype of an individual from immunoglobulin (Ig)
repertoire-sequencing (Rep-Seq) data, including detection of any novel
alleles. This information is then used to correct existing V allele calls
from among the sample sequences.
High-throughput sequencing of B cell immunoglobulin receptors is
providing unprecedented insight into adaptive immunity. A key step in
analyzing these data involves assignment of the germline V, D and J gene
segment alleles that comprise each immunoglobulin sequence by matching
them against a database of known V(D)J alleles. However, this process
will fail for sequences that utilize previously undetected alleles,
whose frequency in the population is unclear.
TIgGER is a computational method that significantly improves V(D)J
allele assignments by first determining the complete set of gene segments
carried by an individual (including novel alleles) from V(D)J-rearranged
sequences. TIgGER can then infer a subject’s genotype from these
sequences, and use this genotype to correct the initial V(D)J allele
assignments.
The application of TIgGER continues to identify a surprisingly high
frequency of novel alleles in humans, highlighting the critical need
for this approach. TIgGER, however, can and has been used with data
from other species.
Core Abilities:
- Detecting novel alleles
- Inferring a subject’s genotype
- Correcting preliminary allele calls
Required Input
- A table of sequences from a single individual, with columns containing
the following:
- V(D)J-rearranged nucleotide sequence (in IMGT-gapped format)
- Preliminary V allele calls
- Preliminary J allele calls
- Length of the junction region
- Germline Ig sequences in IMGT-gapped fasta format (e.g., those
downloaded from IMGT/GENE-DB)
The former can be created through the use of IMGT/HighV-QUEST and
Change-O.
|
|
rna-star
ultrafast universal RNA-seq aligner
|
Versions of package rna-star |
Release | Version | Architectures |
buster | 2.7.0a+dfsg-1 | amd64,arm64 |
stretch | 2.5.2b+dfsg-1 | amd64,arm64,mips64el,ppc64el |
bullseye | 2.7.8a+dfsg-2 | amd64,arm64,mips64el,ppc64el |
bookworm | 2.7.10b+dfsg-2 | amd64,arm64,mips64el,ppc64el |
trixie | 2.7.11b+dfsg-2 | amd64,arm64,mips64el,ppc64el,riscv64 |
sid | 2.7.11b+dfsg-2 | amd64,arm64,mips64el,ppc64el,riscv64 |
stretch-backports | 2.7.0a+dfsg-1~bpo9+1 | amd64,arm64,mips64el,ppc64el |
|
License: DFSG free
|
Spliced Transcripts Alignment to a Reference (STAR) is software based on a
previously undescribed RNA-seq alignment algorithm that uses sequential
maximum mappable seed search in uncompressed suffix arrays followed by
a seed clustering and stitching procedure. STAR outperforms other aligners
by a factor of >50 in mapping speed, aligning to the human genome 550
million 2 × 76 bp paired-end reads per hour on a modest 12-core server,
while at the same time improving alignment sensitivity and precision. In
addition to unbiased de novo detection of canonical junctions, STAR can
discover non-canonical splices and chimeric (fusion) transcripts, and is
also capable of mapping full-length RNA sequences. Using Roche 454
sequencing of reverse transcription polymerase chain reaction amplicons,
the authors experimentally validated 1960 novel intergenic splice
junctions with an 80-90% success rate, corroborating the high precision
of the STAR mapping strategy.
The package is enhanced by the following packages:
multiqc
Topics: Sequence analysis
|
|
rtax
Classification of sequence reads of 16S ribosomal RNA gene
|
Versions of package rtax |
Release | Version | Architectures |
bookworm | 0.984-8 | all |
stretch | 0.984-5 | all |
sid | 0.984-8 | all |
bullseye | 0.984-7 | all |
buster | 0.984-6 | all |
jessie | 0.984-2 | all |
trixie | 0.984-8 | all |
|
License: DFSG free
|
Short-read technologies for microbial community profiling are increasingly
popular, yet previous techniques for assigning taxonomy to paired-end reads
perform poorly. RTAX provides rapid taxonomic assignments of paired-end
reads using a consensus algorithm.
|
|
salmon
wicked-fast transcript quantification from RNA-seq data
|
Versions of package salmon |
Release | Version | Architectures |
sid | 1.10.2+ds1-1 | amd64,arm64 |
bookworm | 1.10.1+ds1-1 | amd64,arm64 |
stretch | 0.7.2+ds1-2 | amd64 |
trixie | 1.10.2+ds1-1 | amd64,arm64 |
bullseye | 1.4.0+ds1-1 | amd64,arm64 |
buster | 0.12.0+ds1-1 | amd64 |
upstream | 1.10.3 |
|
License: DFSG free
|
Salmon is a wicked-fast program to produce highly accurate, transcript-level
quantification estimates from RNA-seq data. Salmon achieves its accuracy and
speed via a number of different innovations, including the use of lightweight
alignments (accurate but fast-to-compute proxies for traditional read
alignments) and massively-parallel stochastic collapsed variational inference.
The result is a versatile tool that fits nicely into many different pipelines.
For example, you can choose to make use of the lightweight alignments by
providing Salmon with raw sequencing reads, or, if it is more convenient, you
can provide Salmon with regular alignments (e.g. computed with your favorite
aligner), and it will use the same wicked-fast, state-of-the-art inference
algorithm to estimate transcript-level abundances for your experiment.
The package is enhanced by the following packages:
multiqc
|
|
sambamba
tools for working with SAM/BAM data
|
Versions of package sambamba |
Release | Version | Architectures |
bookworm | 1.0+dfsg-1 | amd64,arm64 |
sid | 1.0.1+dfsg-2 | amd64,arm64,riscv64 |
bullseye | 0.8.0-1 | amd64,arm64 |
|
License: DFSG free
|
Sambamba positions itself as a performant alternative
to samtools and provides tools for
- Powerful filtering with sambamba view --filter
- Picard-like SAM header merging in the merge tool
- Optional for operations on whole BAMs
- Fast copying of a region to a new file with the slice tool
- Duplicate marking/removal, using the Picard criteria
|
|
samblaster
marks duplicates, extracts discordant/split reads
|
Versions of package samblaster |
Release | Version | Architectures |
sid | 0.1.26-4 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
trixie | 0.1.26-4 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 0.1.26-4 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
buster | 0.1.24-2 | amd64,arm64,armhf,i386 |
bullseye | 0.1.26-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
|
License: DFSG free
|
Current "next-generation" sequencing technologies cannot tell what
exact sequence they will be reading. They take what is available. And
if some sequences are read very often, then this needs some extra
biomedical thinking. The genome could for instance be duplicated.
samblaster is a fast and flexible program for marking duplicates in
read-id grouped paired-end SAM files. It can also optionally output
discordant read pairs and/or split read mappings to separate SAM files,
and/or unmapped/clipped reads to a separate FASTQ file. When marking
duplicates, samblaster will require approximately 20MB of memory per
1M read pairs.
The package is enhanced by the following packages:
multiqc
|
|
samtools
processing sequence alignments in SAM, BAM and CRAM formats
|
Versions of package samtools |
Release | Version | Architectures |
stretch | 1.3.1-3 | amd64,arm64,armel,i386,mips64el,mipsel,ppc64el |
trixie | 1.20-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
buster | 1.9-4 | amd64,arm64,armhf |
bookworm | 1.16.1-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
stretch-backports | 1.7-2~bpo9+1 | amd64,arm64,armel,armhf,mips,mips64el,mipsel,ppc64el,s390x |
bullseye | 1.11-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
experimental | 1.21-0+exp1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
sid | 1.20-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
jessie | 0.1.19-1 | amd64,armhf,i386 |
upstream | 1.21 |
Debtags of package samtools: |
field | biology |
interface | commandline |
network | client |
role | program |
scope | utility |
uitoolkit | ncurses |
use | analysing, calculating, filtering |
works-with | biological-sequence |
|
License: DFSG free
|
Samtools is a set of utilities that manipulate nucleotide sequence alignments
in the binary BAM format. It imports from and exports to the ASCII SAM
(Sequence Alignment/Map) and CRAM formats, does sorting, merging and indexing,
and allows one to retrieve reads in any regions swiftly. It is designed to work
on a stream, and is able to open a BAM or CRAM (not SAM) file on a remote FTP
or HTTP server.
|
|
scoary
pangenome-wide association studies
|
Versions of package scoary |
Release | Version | Architectures |
bullseye | 1.6.16-2 | all |
stretch-backports | 1.6.16-1~bpo9+1 | all |
sid | 1.6.16-9 | all |
trixie | 1.6.16-9 | all |
bookworm | 1.6.16-5 | all |
buster | 1.6.16-1 | all |
|
License: DFSG free
|
Scoary is designed to take the gene_presence_absence.csv file from
Roary as well as a traits file created by the user and calculate the
associations between all genes in the accessory genome and the traits. It
reports a list of genes sorted by strength of association per trait.
|
|
scythe
Bayesian adaptor trimmer for sequencing reads
|
Versions of package scythe |
Release | Version | Architectures |
stretch | 0.994-4 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
bookworm | 0.994+git20141017.20d3cff-3 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
trixie | 0.994+git20141017.20d3cff-5 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
sid | 0.994+git20141017.20d3cff-5 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bullseye | 0.994+git20141017.20d3cff-3 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
buster | 0.994+git20141017.20d3cff-1 | amd64,arm64,armhf,i386 |
|
License: DFSG free
|
Scythe uses a Naive Bayesian approach to classify contaminant substrings in
sequence reads. It considers quality information, which can make it robust in
picking out 3'-end adapters, which often include poor quality bases.
|
|
seqprep
stripping adaptors and/or merging paired reads of DNA sequences with overlap
|
Versions of package seqprep |
Release | Version | Architectures |
sid | 1.3.2-9 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
trixie | 1.3.2-9 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 1.3.2-8 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bullseye | 1.3.2-5 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
buster | 1.3.2-3 | amd64,arm64,armhf,i386 |
stretch | 1.3.2-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
|
License: DFSG free
|
SeqPrep is a program to merge paired end Illumina reads that are overlapping
into a single longer read. It may also just be used for its adapter trimming
feature without doing any paired end overlap. When an adapter sequence is
present, that means that the two reads must overlap (in most cases) so they
are forcefully merged. When reads do not have adapter sequence they must be
treated with care when doing the merging, so a much more specific approach is
taken. The default parameters were chosen with specificity in mind, so that
they could be run on libraries where very few reads are expected to overlap.
It is always safest though to save the overlapping procedure for libraries
where you have some prior knowledge that a significant portion of the reads
will have some overlap.
|
|
seqtk
Fast and lightweight tool for processing sequences in the FASTA or FASTQ format
|
Versions of package seqtk |
Release | Version | Architectures |
jessie | 1.0-1 | amd64,armel,armhf,i386 |
stretch | 1.2-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
sid | 1.4-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
trixie | 1.4-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
buster | 1.3-1 | amd64,arm64,armhf,i386 |
bullseye | 1.3-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bookworm | 1.3-4 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
|
License: DFSG free
|
Currently, seqtk supports quality based trimming with the phred
algorithm, converting fastq to fasta, reverse complementing sequences,
extracting or masking subsequences in regions given in a BED/name list
file, and more. It contains a subsampling module to sample exactly n
sequences or a fraction of sequences.
Seqtk supports both fasta and fastq input files, which can be
optionally gzip compressed.
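Two of the operations listed above, reverse complementing and sampling exactly n sequences, can be sketched in Python as follows. This is an illustration only and not seqtk's code: the function names are hypothetical, and reservoir sampling is used here as one common way to draw exactly n records from a stream, without any claim that seqtk does it the same way.

```python
import random

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def sample_exactly(records, n, seed=11):
    """Draw n records uniformly from an iterable in a single pass.

    Uses reservoir sampling; the seed value is an arbitrary choice
    that makes repeated runs return the same subset.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, record in enumerate(records):
        if i < n:
            reservoir.append(record)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = record
    return reservoir
```

Using the same seed on both files of a paired-end data set keeps the selected records in step, provided both files list the reads in the same order.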
|
|
sga
de novo genome assembler that uses string graphs
|
Versions of package sga |
Release | Version | Architectures |
stretch | 0.10.15-2 | amd64,arm64,mips64el,ppc64el |
bullseye | 0.10.15-5 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el |
trixie | 0.10.15-7 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64 |
buster | 0.10.15-4 | amd64,arm64 |
sid | 0.10.15-7 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64 |
bookworm | 0.10.15-7 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el |
|
License: DFSG free
|
The major goal of SGA is to be very memory efficient, which is achieved by
using a compressed representation of DNA sequence reads.
SGA is a de novo assembler for DNA sequence reads. It is based on Gene Myers'
string graph formulation of assembly and uses the FM-index/Burrows-Wheeler
transform to efficiently find overlaps between sequence reads.
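As a rough illustration of the Burrows-Wheeler transform and FM-index style searching mentioned above, the following Python sketch builds the transform naively and counts exact occurrences of a pattern by backward search. It is a teaching sketch, not SGA's implementation: the function names are hypothetical, the text must not contain the "$" sentinel, and a real FM-index would use compressed rank structures instead of repeated string counting.

```python
def bwt(text):
    """Burrows-Wheeler transform of text, using '$' as end marker."""
    text += "$"
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_str, pattern):
    """Count exact occurrences of pattern using backward search."""
    # first_pos[c] = index of the first row starting with character c.
    first_pos = {}
    for i, ch in enumerate(sorted(bwt_str)):
        first_pos.setdefault(ch, i)

    def occ(ch, i):
        # Number of occurrences of ch in bwt_str[:i] (naive rank query).
        return bwt_str[:i].count(ch)

    lo, hi = 0, len(bwt_str)
    for ch in reversed(pattern):
        if ch not in first_pos:
            return 0
        lo = first_pos[ch] + occ(ch, lo)
        hi = first_pos[ch] + occ(ch, hi)
        if lo >= hi:
            return 0
    return hi - lo
```

The same backward search, applied to read suffixes and prefixes, is the kind of query an overlap finder can build on.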
|
|
sickle
windowed adaptive trimming tool for FASTQ files using quality
|
Versions of package sickle |
Release | Version | Architectures |
trixie | 1.33+git20150314.f3d6ae3-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
buster | 1.33+git20150314.f3d6ae3-1 | amd64,arm64,armhf,i386 |
bullseye | 1.33+git20150314.f3d6ae3-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
stretch | 1.33-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
sid | 1.33+git20150314.f3d6ae3-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 1.33+git20150314.f3d6ae3-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
|
License: DFSG free
|
Most modern sequencing technologies produce reads that have deteriorating
quality towards the 3'-end. Incorrectly called bases here negatively impact
assemblies, mapping, and downstream bioinformatics analyses.
Sickle is a tool that uses sliding windows along with quality and length
thresholds to determine when quality is sufficiently low to trim the 3'-end
of reads. It will also discard reads based upon the length threshold. It takes
the quality values and slides a window across them whose length is 0.1 times
the length of the read. If this length is less than 1, then the window is set
to be equal to the length of the read. Otherwise, the window slides along the
quality values until the average quality in the window drops below the
threshold. At that point the algorithm determines where in the window the drop
occurs and cuts both the read and quality strings there. However, if the cut
point is less than the minimum length threshold, then the read is discarded
entirely.
Sickle supports four types of quality values: Illumina, Solexa, Phred, and
Sanger. Note that the Solexa quality setting is an approximation (the actual
conversion is a non-linear transformation). The end approximation is close.
Sickle also supports gzipped file inputs.
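The 3'-end trimming described above can be sketched in Python as follows. This is a simplified illustration and not Sickle's source code: the function name and the default thresholds are hypothetical, and the rule used to locate the drop inside the window (cut at the first base in that window whose quality is below the threshold) is an assumption, since the description does not spell it out.

```python
def trim_3prime(seq, quals, qual_threshold=20, min_length=20):
    """Trim the 3'-end of a read using a sliding quality window.

    Returns (trimmed_seq, trimmed_quals), or None if the read is
    discarded because the cut point is below min_length.
    """
    n = len(seq)
    if n == 0:
        return None
    # Window is 0.1 times the read length; very short reads use
    # the whole read as a single window.
    window = int(0.1 * n)
    if window < 1:
        window = n
    cut = n
    for start in range(n - window + 1):
        win = quals[start:start + window]
        if sum(win) / window < qual_threshold:
            # Assumption: the drop is the first low-quality base here.
            cut = next(start + i for i, q in enumerate(win)
                       if q < qual_threshold)
            break
    if cut < min_length:
        return None
    return seq[:cut], quals[:cut]
```

For a 30-base read whose last five bases have quality 5 and the rest quality 40, the window of three bases first falls below an average of 20 when it covers two low-quality bases, and the read is cut back to its first 25 bases.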
The package is enhanced by the following packages:
multiqc
|
|
smalt
Sequence Mapping and Alignment Tool
|
Versions of package smalt |
Release | Version | Architectures |
sid | 0.7.6-13 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
stretch | 0.7.6-6 | amd64,arm64,armel,i386,mips64el,mipsel,ppc64el |
bullseye | 0.7.6-9 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
jessie | 0.7.6-4 | amd64,armhf,i386 |
trixie | 0.7.6-13 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 0.7.6-12 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
buster | 0.7.6-8 | amd64,arm64,armhf |
|
License: DFSG free
|
SMALT efficiently aligns DNA sequencing reads with a reference genome.
Reads from a wide range of sequencing platforms, for example Illumina,
Roche-454, Ion Torrent, PacBio or ABI-Sanger, can be processed including
paired reads.
The software employs a perfect hash index of short words (< 20
nucleotides long), sampled at equidistant steps along the genomic
reference sequences.
For each read, potentially matching segments in the reference are
identified from seed matches in the index and subsequently aligned with
the read using a banded Smith-Waterman algorithm.
The best gapped alignment of each read is reported including a score
for the reliability of the best mapping. The user can adjust the
trade-off between sensitivity and speed by tuning the length and spacing
of the hashed words.
A mode for the detection of split (chimeric) reads is provided.
Multi-threaded program execution is supported.
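The seeding step described above can be sketched in Python as follows. This is only an illustration of indexing short words at equidistant steps and collecting seed matches: the function names are hypothetical, an ordinary dictionary stands in for the perfect hash, the tiny word length is chosen for readability, and the banded Smith-Waterman alignment of the candidates is left out.

```python
from collections import defaultdict


def build_index(reference, word_len=4, step=2):
    """Index words of word_len bases sampled every step bases."""
    index = defaultdict(list)
    for pos in range(0, len(reference) - word_len + 1, step):
        index[reference[pos:pos + word_len]].append(pos)
    return index


def seed_hits(read, index, word_len=4):
    """Count seed matches per candidate start offset in the reference.

    Every word of the read is looked up; a hit at reference position
    pos for read offset i votes for the read starting at pos - i.
    """
    votes = defaultdict(int)
    for i in range(len(read) - word_len + 1):
        for pos in index.get(read[i:i + word_len], ()):
            votes[pos - i] += 1
    return dict(votes)
```

A larger step makes the index smaller and the lookup faster but leaves fewer seeds per read, which is the sensitivity versus speed trade-off mentioned above.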
|
|
smrtanalysis
software suite for single molecule, real-time sequencing
|
Versions of package smrtanalysis |
Release | Version | Architectures |
trixie | 0~20210112 | all |
sid | 0~20210112 | all |
bullseye | 0~20210111 | all |
stretch | 0~20161126 | all |
bookworm | 0~20210112 | all |
|
License: DFSG free
|
SMRT® Analysis is a powerful, open-source bioinformatics software suite
available for analysis of DNA sequencing data from Pacific Biosciences’
SMRT technology. Users can choose from a variety of analysis protocols that
utilize PacBio® and third-party tools. Analysis protocols include de novo
genome assembly, cDNA mapping, DNA base-modification detection, and
long-amplicon analysis to determine phased consensus sequences.
This is a metapackage that depends on the components of SMRT Analysis.
|
|
snap-aligner
Scalable Nucleotide Alignment Program
|
Versions of package snap-aligner |
Release | Version | Architectures |
bullseye | 1.0.0+dfsg-2 | amd64,arm64,mips64el,ppc64el |
sid | 2.0.3+dfsg-2 | amd64,arm64,mips64el,ppc64el,riscv64 |
trixie | 2.0.3+dfsg-2 | amd64,arm64,mips64el,ppc64el,riscv64 |
buster | 1.0~beta.18+dfsg-3 | amd64,arm64 |
stretch | 1.0~beta.18+dfsg-1 | amd64,arm64,mips64el,ppc64el |
bookworm | 2.0.2+dfsg-1 | amd64,arm64,mips64el,ppc64el |
|
License: DFSG free
|
SNAP is a new sequence aligner that is 3-20x faster and just as accurate as
existing tools like BWA-mem, Bowtie2 and Novoalign. It runs on commodity x86
processors, and supports a rich error model that lets it cheaply match reads
with more differences from the reference than other tools. This gives SNAP up
to 2x lower error rates than existing tools (in some cases) and lets it match
larger mutations that they may miss. SNAP also natively reads BAM, FASTQ, or
gzipped FASTQ, and natively writes SAM or BAM, with built-in sorting,
duplicate marking, and BAM indexing.
|
|
sniffles
structural variation caller using third-generation sequencing
|
Versions of package sniffles |
Release | Version | Architectures |
stretch | 1.0.2+ds-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
bookworm | 2.0.7-1 | all |
buster | 1.0.11+ds-1 | amd64,arm64,armhf,i386 |
bullseye | 1.0.12b+ds-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
sid | 2.2-1 | all |
trixie | 2.2-1 | all |
upstream | 2.5.2 |
|
License: DFSG free
|
Sniffles is a structural variation (SV) caller using third-generation
sequencing data such as those from Pacific Biosciences or Oxford
Nanopore platforms. It detects all types of SVs using evidence from
split-read alignments, high-mismatch regions, and coverage analysis.
|
|
snp-sites
Binary code for the package snp-sites
|
Versions of package snp-sites |
Release | Version | Architectures |
bookworm | 2.5.1-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bullseye | 2.5.1-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
buster | 2.4.1-1 | amd64,arm64,armhf,i386 |
stretch | 2.3.2-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
jessie | 1.5.0-1 | amd64,armel,armhf,i386 |
sid | 2.5.1-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
trixie | 2.5.1-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
|
License: DFSG free
|
This program finds single nucleotide polymorphism (SNP) sites from
multi-fasta alignment input files (which might be compressed). Its
output can be in various widely used formats (Multi Fasta Alignment,
Vcf, phylip).
The software has been developed at the Wellcome Trust Sanger Institute.
A single nucleotide polymorphism (SNP, pronounced snip; plural snips)
is a DNA sequence variation occurring when a single nucleotide (A, T, C
or G) in the genome (or other shared sequence) differs between members
of a biological species or paired chromosomes. For example, two
sequenced DNA fragments from different individuals, AAGCCTA and AAGCTTA,
contain a difference in a single nucleotide. In this case there are two
alleles. Almost all common SNPs have only two alleles.
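The example above can be reproduced with a few lines of Python that scan the columns of an alignment for positions with more than one base. This is a minimal sketch and not the snp-sites program: the function name is hypothetical, and ignoring gap ("-") and unknown ("N") characters is an assumption made for this illustration.

```python
def snp_columns(alignment):
    """Return 0-based column indices where aligned sequences differ."""
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("aligned sequences must have equal length")
    sites = []
    for i, column in enumerate(zip(*alignment)):
        # Assumption: gaps and unknown bases do not count as alleles.
        bases = {base.upper() for base in column} - {"-", "N"}
        if len(bases) > 1:
            sites.append(i)
    return sites
```

For the two fragments AAGCCTA and AAGCTTA this reports a single variable column, index 4 when counting from zero, with the alleles C and T.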
Topics: Genetic variation
|
|
snpomatic
fast, stringent short-read mapping software
|
Versions of package snpomatic |
Release | Version | Architectures |
bookworm | 1.0-6 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
buster | 1.0-4 | amd64,arm64,armhf,i386 |
bullseye | 1.0-5 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
stretch | 1.0-3 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
sid | 1.0-7 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
trixie | 1.0-7 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
|
License: DFSG free
|
High throughput sequencing technologies generate large amounts of short reads.
Mapping these to a reference sequence consumes large amounts of processing
time and memory, and read mapping errors can lead to noisy or incorrect
alignments.
SNP-o-matic is a fast, stringent short-read mapping software. It supports a
multitude of output types and formats, for uses in filtering reads, alignments,
sequence-based genotyping calls, assisted reassembly of contigs etc.
Please cite:
Heinrich Magnus Manske and Dominic P. Kwiatkowski:
SNP-o-matic.
(PubMed,eprint)
Bioinformatics
25(18):2434-2435
(2009)
Topics: Genetic variation; Mapping
|
|
soapdenovo
short-read assembly method to build de novo draft assembly
|
Versions of package soapdenovo |
Release | Version | Architectures |
sid | 1.05-6 | amd64 |
bullseye | 1.05-6 | amd64 |
jessie | 1.05-2 | amd64 |
trixie | 1.05-6 | amd64 |
bookworm | 1.05-6 | amd64 |
stretch | 1.05-3 | amd64 |
buster | 1.05-5 | amd64 |
|
License: DFSG free
|
SOAPdenovo is a novel short-read assembly method that can build a de novo draft
assembly for human-sized genomes. The program is specially designed to
assemble Illumina GA short reads.
It creates new opportunities for building reference
sequences and carrying out accurate analyses of unexplored genomes in a cost
effective way.
This version is no longer maintained; consider using soapdenovo2.
Please cite:
Ruiqiang Li, Hongmei Zhu, Jue Ruan, Wubin Qian, Xiaodong Fang, Zhongbin Shi, Yingrui Li, Shengting Li, Gao Shan, Karsten Kristiansen, Songgang Li, Huanming Yang, Jian Wang and Jun Wang:
De novo assembly of human genomes with massively parallel short read sequencing.
(PubMed,eprint)
Genome Research
20(2):265-72
(2009)
|
|
soapdenovo2
short-read assembly method to build de novo draft assembly
|
Versions of package soapdenovo2 |
Release | Version | Architectures |
jessie | 240+dfsg-2 | amd64 |
stretch | 240+dfsg1-2 | amd64 |
buster | 241+dfsg-3 | amd64 |
bullseye | 242+dfsg-1 | amd64 |
bookworm | 242+dfsg-3 | amd64 |
trixie | 242+dfsg-4 | amd64 |
sid | 242+dfsg-4 | amd64 |
|
License: DFSG free
|
SOAPdenovo is a novel short-read assembly method that can build a de novo draft
assembly for human-sized genomes. The program is specially designed to
assemble Illumina GA short reads.
It creates new opportunities for building reference
sequences and carrying out accurate analyses of unexplored genomes in a cost
effective way.
Please cite:
Ruibang Luo, Binghang Liu, Yinlong Xie, Zhenyu Li, Weihua Huang, Jianying Yuan, Guangzhu He, Yanxiang Chen, Qi Pan, Yunjie Liu, Jingbo Tang, Gengxiong Wu, Hao Zhang, Yujian Shi, Yong Liu, Chang Yu, Bo Wang, Yao Lu, Changlei Han, David W Cheung, Siu-Ming Yiu, Shaoliang Peng, Zhu Xiaoqian, Guangming Liu, Xiangke Liao, Yingrui Li, Huanming Yang, Jian Wang, Tak-Wah Lam and Jun Wang:
SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler.
Giga Science
1(1):18
(2012)
|
|
sortmerna
tool for filtering, mapping and OTU-picking NGS reads
|
Versions of package sortmerna |
Release | Version | Architectures |
bullseye | 2.1-5 | amd64,i386 |
stretch | 2.1-1 | amd64,i386 |
buster | 2.1-3 | amd64,i386 |
sid | 4.3.7-1 | amd64,i386 |
trixie | 4.3.7-1 | amd64,i386 |
bookworm | 4.3.6-2 | amd64,i386 |
|
License: DFSG free
|
SortMeRNA is a biological sequence analysis tool for filtering, mapping and
OTU-picking NGS reads. The core algorithm is based on approximate seeds and
allows for fast and sensitive analyses of nucleotide sequences. The main
application of SortMeRNA is filtering rRNA from metatranscriptomic data.
Additional applications include OTU-picking and taxonomy assignment available
through QIIME v1.9+ (http://qiime.org - v1.9.0-rc1).
SortMeRNA takes as input a file of reads (fasta or fastq format) and one or
multiple rRNA database file(s), and sorts apart rRNA and rejected reads into
two files specified by the user. Optionally, it can provide high quality local
alignments of rRNA reads against the rRNA database. SortMeRNA works with
Illumina, 454, Ion Torrent and PacBio data, and can produce SAM and
BLAST-like alignments.
The package is enhanced by the following packages:
multiqc
|
|
spades
genome assembler for single-cell and isolates data sets
|
Versions of package spades |
Release | Version | Architectures |
stretch-backports-sloppy | 3.13.1+dfsg-2~bpo9+1 | amd64 |
trixie | 3.15.5+dfsg-7 | amd64 |
sid | 3.15.5+dfsg-7 | amd64 |
experimental | 4.0.0+dfsg1-1 | amd64 |
bullseye | 3.13.1+dfsg-2 | amd64 |
bookworm | 3.15.5+dfsg-2 | amd64 |
stretch | 3.9.1+dfsg-1 | amd64 |
stretch-backports | 3.12.0+dfsg-1~bpo9+1 | amd64 |
buster | 3.13.0+dfsg2-2 | amd64 |
upstream | 4.0.0 |
|
License: DFSG free
|
The SPAdes – St. Petersburg genome assembler is intended for both
standard isolates and single-cell MDA bacteria assemblies. It works
with Illumina or IonTorrent reads and is capable of providing hybrid
assemblies using PacBio and Sanger reads. You can also provide
additional contigs that will be used as long reads.
This package provides the following additional pipelines:
- metaSPAdes – a pipeline for metagenomic data sets
- plasmidSPAdes – a pipeline for extracting and assembling plasmids
from WGS data sets
- metaplasmidSPAdes – a pipeline for extracting and assembling
plasmids from metagenomic data sets
- rnaSPAdes – a de novo transcriptome assembler from RNA-Seq data
- truSPAdes – a module for TruSeq barcode assembly
- biosyntheticSPAdes – a module for biosynthetic gene cluster
assembly with paired-end reads
SPAdes provides several stand-alone binaries with a relatively simple
command-line interface: k-mer counting (spades-kmercounter), assembly
graph construction (spades-gbuilder) and long read to graph aligner
(spades-gmapper).
Please cite:
Anton Bankevich, Sergey Nurk, Dmitry Antipov, Alexey A. Gurevich, Mikhail Dvorkin, Alexander S. Kulikov, Valery M. Lesin, Sergey I. Nikolenko, Son Pham, Andrey D. Prjibelski, Alexey V. Pyshkin, Alexander V. Sirotkin, Nikolay Vyahhi, Glenn Tesler, Max A. Alekseyev and Pavel A. Pevzner:
SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing.
(PubMed,eprint)
Journal of Computational Biology
19(5):455-477
(2012)
|
|
sprai
single-pass sequencing read accuracy improver
|
Versions of package sprai |
Release | Version | Architectures |
buster | 0.9.9.23+dfsg-2 | amd64,arm64,armhf,i386 |
bookworm | 0.9.9.23+dfsg1-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
sid | 0.9.9.23+dfsg1-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bullseye | 0.9.9.23+dfsg1-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
trixie | 0.9.9.23+dfsg1-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
stretch | 0.9.9.22+dfsg-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
|
License: DFSG free
|
Sprai is a tool to correct sequencing errors in single-pass reads for
de novo assembly. It was originally designed for correcting sequencing
errors in single-molecule DNA sequencing reads, especially in Continuous
Long Reads (CLRs) generated by PacBio RS sequencers. The goal of Sprai is
not maximizing the accuracy of error-corrected reads. Instead, Sprai aims
at maximizing the continuity (i.e., N50 contig length) of assembled contigs
after error correction.
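The N50 contig length mentioned above can be illustrated with a short sketch. This is a generic N50 calculation, not code taken from Sprai; the function name `n50` and the example lengths are assumptions made for illustration.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list).

    N50 is the length of the contig at which the running total, taken
    from the longest contig downwards, first reaches half of the total
    assembly size.
    """
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


# Hypothetical contig lengths; the total is 290, so half is 145.
example_lengths = [80, 70, 50, 40, 30, 20]
print(n50(example_lengths))  # 80 + 70 = 150 >= 145, so this prints 70
```

A higher N50 after error correction means that more of the assembly sits in long contigs, which is the continuity Sprai aims to maximize.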
|
|
sra-toolkit
utilities for the NCBI Sequence Read Archive
|
Versions of package sra-toolkit |
Release | Version | Architectures |
trixie | 3.0.9+dfsg-7 | amd64,arm64 |
stretch | 2.8.1-2+dfsg-2 | amd64,i386 |
buster | 2.9.3+dfsg-1 | amd64 |
bullseye | 2.10.9+dfsg-2 | amd64 |
jessie | 2.3.5-2+dfsg-1 | amd64,i386 |
bookworm | 3.0.3+dfsg-6~deb12u1 | amd64,arm64 |
sid | 3.0.9+dfsg-7 | amd64,arm64 |
upstream | 3.1.1 |
|
License: DFSG free
|
Tools for reading the SRA archive, generally by converting individual runs
into some commonly used format such as fastq.
The textual dumpers "sra-dump" and "vdb-dump" are provided in this
release as an aid in visual inspection. It is likely that their
actual output formatting will be changed in the near future to a
stricter, more formalized representation[s]. PLEASE DO NOT RELY UPON
THE OUTPUT FORMAT SEEN IN THIS RELEASE.
Other tools distributed in this package are:
abi-dump, abi-load
align-info
bam-load
cache-mgr
cg-load
copycat
fasterq-dump
fastq-dump, fastq-load
helicos-load
illumina-dump, illumina-load
kar
kdbmeta
latf-load
pacbio-load
prefetch
rcexplain
remote-fuser
sff-dump, sff-load
sra-pileup, sra-sort, sra-stat, srapath
srf-load
test-sra
vdb-config, vdb-copy, vdb-decrypt, vdb-encrypt, vdb-get, vdb-lock,
vdb-passwd, vdb-unlock, vdb-validate
The "help" information will be improved in near future releases, and
the tool options will become standardized across the set. More documentation
will also be provided on the NCBI web site.
Tool options may change in the next release. Version 1 tool options
will remain supported wherever possible in order to preserve
operation of any existing scripts.
Please cite:
Rasko Leinonen, Ruth Akhtar, Ewan Birney, James Bonfield, Lawrence Bower, Matt Corbett, Ying Cheng, Fehmi Demiralp, Nadeem Faruque, Neil Goodgame, Richard Gibson, Gemma Hoad, Christopher Hunter, Mikyung Jang, Steven Leonard, Quan Lin, Rodrigo Lopez, Michael Maguire, Hamish McWilliam, Sheila Plaister, Rajesh Radhakrishnan, Siamak Sobhany, Guy Slater, Petra Ten Hoopen, Franck Valentin, Robert Vaughan, Vadim Zalunin, Daniel Zerbino and Guy Cochrane:
Improvements to services at the European Nucleotide Archive.
(PubMed,eprint)
Nucleic Acids Research
38(Database issue):D39-45
(2010)
|
|
srst2
Short Read Sequence Typing for Bacterial Pathogens
|
Versions of package srst2 |
Release | Version | Architectures |
bookworm | 0.2.0-9 | amd64,arm64,mips64el,ppc64el |
trixie | 0.2.0-12 | amd64,arm64,mips64el,ppc64el,riscv64 |
stretch | 0.2.0-4 | amd64 |
buster | 0.2.0-6 | amd64 |
bullseye | 0.2.0-8 | amd64,arm64,mips64el,ppc64el |
sid | 0.2.0-12 | amd64,arm64,mips64el,ppc64el,riscv64 |
|
License: DFSG free
|
This program is designed to take Illumina sequence data, an MLST database
and/or a database of gene sequences (e.g. resistance genes, virulence
genes, etc) and report the presence of STs and/or reference genes.
|
|
ssake
genomics application for assembling millions of very short DNA sequences
|
Versions of package ssake |
Release | Version | Architectures |
trixie | 4.0.1-2 | all |
bookworm | 4.0.1-1 | all |
sid | 4.0.1-2 | all |
jessie | 3.8.2-1 | all |
stretch | 3.8.4-1 | all |
buster | 4.0-2 | all |
bullseye | 4.0-3 | all |
Debtags of package ssake: |
biology | nuceleic-acids |
field | biology |
interface | shell |
role | program |
scope | utility |
use | analysing |
|
License: DFSG free
|
The Short Sequence Assembly by K-mer search and 3′ read Extension
(SSAKE) is a genomics application for aggressively assembling
millions of short nucleotide sequences by progressively searching for
perfect 3′-most k-mers using a DNA prefix tree. SSAKE is designed to
help leverage the information from short sequence reads by
stringently clustering them into contigs that can be used to
characterize novel sequencing targets.
Topics: Sequence assembly
|
|
stacks
pipeline for building loci from short-read DNA sequences
|
Versions of package stacks |
Release | Version | Architectures |
bookworm | 2.62+dfsg-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bullseye | 2.55+dfsg-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
trixie | 2.68+dfsg-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
sid | 2.68+dfsg-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
buster | 2.2+dfsg-1 | amd64,arm64,armhf |
stretch | 1.44-2 | amd64,arm64,armel,i386,mips64el,mipsel,ppc64el |
|
License: DFSG free
|
Stacks is a software pipeline for building loci from short-read sequences,
such as those generated on the Illumina platform. Stacks was developed to work
with restriction enzyme-based data, such as RAD-seq, for the purpose of
building genetic maps and conducting population genomics and phylogeography.
Note that this package installs Stacks such that all commands must be run as:
$ stacks
The package is enhanced by the following packages:
multiqc
|
|
stringtie
assemble short RNAseq reads to transcripts
|
Versions of package stringtie |
Release | Version | Architectures |
sid | 2.2.1+ds-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bullseye | 2.1.4+ds-4 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
trixie | 2.2.1+ds-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 2.2.1+ds-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
upstream | 2.2.3 |
|
License: DFSG free
|
The abundance of transcripts in a human tissue sample
can be determined by RNA sequencing. The exact sequence
sampled may be random, depending on the technology used.
And it may be short, i.e. shorter than the transcript.
At some point, many shorter reads need to be assembled
to model the complete transcripts.
StringTie knows how to assemble RNA-Seq reads into potential
transcripts without the need of a reference genome and
also provides a quantification of the splice variants.
|
|
subread
toolkit for processing next-gen sequencing data
|
Versions of package subread |
Release | Version | Architectures |
buster-backports | 2.0.0+dfsg-1~bpo10+1 | amd64,arm64,armel,armhf,i386,ppc64el |
buster | 1.6.3+dfsg-1 | amd64,arm64,armhf,i386 |
trixie | 2.0.7+dfsg-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64 |
stretch | 1.5.1+dfsg-4 | amd64,arm64,armel,armhf,i386,ppc64el |
bookworm | 2.0.3+dfsg-1 | amd64,arm64,armel,armhf,i386,ppc64el |
bullseye | 2.0.1+dfsg-1 | amd64,arm64,armel,armhf,i386,ppc64el |
sid | 2.0.7+dfsg-1 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64 |
upstream | 2.0.8 |
|
License: DFSG free
|
Subread aligner can be used to align both gDNA-seq and RNA-seq reads.
Subjunc aligner was specifically designed for the detection of exon-exon
junctions. For the mapping of RNA-seq reads, Subread performs local
alignments and Subjunc performs global alignments.
|
|
sumaclust
fast and exact clustering of genomic sequences
|
Versions of package sumaclust |
Release | Version | Architectures |
bookworm | 1.0.36+ds-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
sid | 1.0.36+ds-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
buster | 1.0.31-2 | amd64,arm64,armhf,i386 |
stretch | 1.0.20-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
bullseye | 1.0.36+ds-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
|
License: DFSG free
|
With the development of next-generation sequencing, efficient tools are
needed to handle millions of sequences in reasonable amounts of time.
Sumaclust is a program developed by the LECA. Sumaclust aims to cluster
sequences in a way that is fast and exact at the same time. This tool
has been developed to be adapted to the type of data generated by DNA
metabarcoding, i.e. entirely sequenced, short markers. Sumaclust
clusters sequences using the same clustering algorithm as UCLUST and
CD-HIT. This algorithm is mainly useful to detect the 'erroneous' sequences
created during amplification and sequencing protocols, deriving from
'true' sequences.
|
|
sumatra
fast and exact comparison and clustering of sequences
|
Versions of package sumatra |
Release | Version | Architectures |
bullseye | 1.0.36+ds-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bookworm | 1.0.36+ds-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
stretch | 1.0.20-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
buster | 1.0.31-2 | amd64,arm64,armhf,i386 |
sid | 1.0.36+ds-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
|
License: DFSG free
|
With the development of next-generation sequencing, efficient tools are
needed to handle millions of sequences in reasonable amounts of time.
Sumatra is a program developed by the LECA. Sumatra aims to compare
sequences in a way that is fast and exact at the same time. This tool
has been developed to be adapted to the type of data generated by DNA
metabarcoding, i.e. entirely sequenced, short markers. Sumatra computes
the pairwise alignment scores from one dataset or between two datasets,
with the possibility to specify a similarity threshold under which pairs
of sequences that have a lower similarity are not reported. The output
can then go through a classification process with programs such as MCL
or MOTHUR.
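The idea of reporting only pairs above a similarity threshold can be sketched as follows. This is a minimal illustration, not Sumatra's implementation: the score used here (longest common subsequence length divided by the length of the longer sequence) is an assumption, and the function names and example sequences are hypothetical.

```python
from itertools import combinations


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    previous = [0] * (len(b) + 1)
    for char_a in a:
        current = [0]
        for j, char_b in enumerate(b, start=1):
            if char_a == char_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def similar_pairs(sequences, threshold):
    """Return (name1, name2, score) for every pair scoring at least threshold.

    The score is an assumed normalisation: LCS length / longer sequence length.
    """
    results = []
    for (name_a, seq_a), (name_b, seq_b) in combinations(sequences.items(), 2):
        longest = max(len(seq_a), len(seq_b))
        if longest == 0:
            continue  # two empty sequences carry no information
        score = lcs_length(seq_a, seq_b) / longest
        if score >= threshold:
            results.append((name_a, name_b, score))
    return results


# Hypothetical short markers.
markers = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "TTTTTTTT"}
print(similar_pairs(markers, 0.8))  # [('s1', 's2', 0.875)]
```

Pairs below the threshold are simply never emitted, which keeps the output small enough to hand to a downstream classification step.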
|
|
tabix
generic indexer for TAB-delimited genome position files
|
Versions of package tabix |
Release | Version | Architectures |
jessie | 1.1-1 | amd64,armel,i386 |
experimental | 1.21+ds-0+exp2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 1.16+ds-3 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
stretch | 1.3.2-2 | amd64,arm64,armel,i386,mips64el,mipsel,ppc64el |
buster | 1.9-12~deb10u1 | amd64,arm64,armhf,i386 |
sid | 1.20+ds-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
trixie | 1.20+ds-2 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
stretch-backports | 1.7-2~bpo9+1 | amd64,arm64,armel,armhf,mips,mips64el,mipsel,ppc64el,s390x |
bullseye | 1.11-4 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
jessie | 0.2.6-2 | armhf |
upstream | 1.21 |
Debtags of package tabix: |
role | program |
works-with-format | html |
|
License: DFSG free
|
Tabix indexes files where some columns indicate sequence coordinates: name
(usually a chromosome), start and stop. The input data file must be position
sorted and compressed by bgzip (provided in this package), which has a gzip
like interface. After indexing, tabix is able to quickly retrieve data lines by
chromosomal coordinates. Fast data retrieval also works over the network if a URI
is given as a file name.
This package is built from the HTSlib source, and provides the bgzip, htsfile,
and tabix tools.
|
|
transrate-tools
|
Versions of package transrate-tools |
Release | Version | Architectures |
buster | 1.0.0-2 | amd64,arm64,armhf,i386 |
sid | 1.0.0-5 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
trixie | 1.0.0-5 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 1.0.0-5 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bullseye | 1.0.0-3 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
stretch | 1.0.0-1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
|
License: DFSG free
|
Transrate is a library and command-line tool for quality assessment of de-novo
transcriptome assemblies.
This package provides command line tools used by transrate to process BAM
files.
|
|
trimmomatic
flexible read trimming tool for Illumina NGS data
|
Versions of package trimmomatic |
Release | Version | Architectures |
trixie | 0.39+dfsg-2 | all |
bullseye | 0.39+dfsg-2 | all |
bookworm | 0.39+dfsg-2 | all |
stretch | 0.36+dfsg-1 | all |
buster | 0.38+dfsg-1 | all |
jessie | 0.32+dfsg-4 | all |
sid | 0.39+dfsg-2 | all |
|
License: DFSG free
|
Trimmomatic performs a variety of useful trimming tasks for Illumina
paired-end and single-ended data. The selection of trimming steps and
their associated parameters are supplied on the command line.
The current trimming steps are:
- ILLUMINACLIP: Cut adapter and other Illumina-specific sequences from
the read.
- SLIDINGWINDOW: Perform a sliding window trimming, cutting once the
average quality within the window falls below a threshold.
- LEADING: Cut bases off the start of a read, if below a threshold quality
- TRAILING: Cut bases off the end of a read, if below a threshold quality
- CROP: Cut the read to a specified length
- HEADCROP: Cut the specified number of bases from the start of the read
- MINLENGTH: Drop the read if it is below a specified length
- TOPHRED33: Convert quality scores to Phred-33
- TOPHRED64: Convert quality scores to Phred-64
It works with FASTQ (using phred + 33 or phred + 64 quality scores,
depending on the Illumina pipeline used), either uncompressed or
gzipped FASTQ. Use of gzip format is determined based on the .gz
extension.
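The sliding window step described above can be sketched in a few lines. This is a simplified illustration, not Trimmomatic's own code: it assumes Phred+33 quality strings, cuts at the start of the first failing window, and leaves reads shorter than the window untouched. The function names and the example read are hypothetical.

```python
def phred33_scores(quality_string):
    """Decode a Phred+33 quality string into integer quality scores."""
    return [ord(ch) - 33 for ch in quality_string]


def sliding_window_trim(sequence, quality_string, window_size, min_avg_quality):
    """Cut the read at the first window whose mean quality is too low.

    Simplified assumption: the cut is placed at the start of the failing
    window, and reads shorter than the window are returned unchanged.
    """
    scores = phred33_scores(quality_string)
    keep = len(scores)
    for start in range(max(len(scores) - window_size + 1, 0)):
        window = scores[start:start + window_size]
        if sum(window) / window_size < min_avg_quality:
            keep = start
            break
    return sequence[:keep], quality_string[:keep]


# Hypothetical read: 'I' decodes to quality 40 and '#' to quality 2.
trimmed = sliding_window_trim("ACGTACGTAC", "IIIIII####", 4, 20)
print(trimmed)  # ('ACGTA', 'IIIII')
```

In this example the windows starting at positions 0 to 4 have a mean quality of at least 21, while the window starting at position 5 averages 11.5, so the read is cut there.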
The package is enhanced by the following packages:
multiqc
|
|
trinityrnaseq
|
Versions of package trinityrnaseq |
Release | Version | Architectures |
sid | 2.15.2+dfsg-1 | amd64,arm64,ppc64el,riscv64 |
stretch | 2.2.0+dfsg-2 | amd64 |
bullseye | 2.11.0+dfsg-6 | amd64,arm64 |
buster | 2.6.6+dfsg-6 | amd64 |
trixie | 2.15.2+dfsg-1 | amd64,arm64,ppc64el,riscv64 |
|
License: DFSG free
|
Trinity represents a novel method for the efficient and robust de novo
reconstruction of transcriptomes from RNA-seq data. Trinity combines three
independent software modules: Inchworm, Chrysalis, and Butterfly, applied
sequentially to process large volumes of RNA-seq reads. Trinity partitions
the sequence data into many individual de Bruijn graphs, each representing the
transcriptional complexity at a given gene or locus, and then processes
each graph independently to extract full-length splicing isoforms and to tease
apart transcripts derived from paralogous genes.
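As a rough illustration of what a de Bruijn graph built from reads looks like, the sketch below links each (k-1)-mer to the (k-1)-mers that follow it. It is a textbook construction and not Trinity's implementation; the function name, the value of k and the reads are assumptions chosen for illustration.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the list of (k-1)-mers that follow it in a read.

    Every k-mer in a read contributes one edge from its prefix to its
    suffix, so repeated k-mers produce repeated edges.
    """
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


# Two hypothetical overlapping reads, with k = 3.
graph = de_bruijn_graph(["ACGT", "CGTA"], 3)
print(graph)  # {'AC': ['CG'], 'CG': ['GT', 'GT'], 'GT': ['TA']}
```

Walking the edges AC, CG, GT, TA spells out ACGTA, the sequence shared by the two overlapping reads.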
Please cite:
Manfred G Grabherr, Brian J Haas, Moran Yassour, Joshua Z Levin, Dawn A Thompson, Ido Amit, Xian Adiconis, Lin Fan, Raktima Raychowdhury, Qiandong Zeng, Zehua Chen, Evan Mauceli, Nir Hacohen, Andreas Gnirke, Nicholas Rhind, Federica di Palma, Bruce W Birren, Chad Nusbaum, Kerstin Lindblad-Toh, Nir Friedman and Aviv Regev:
Full-length transcriptome assembly from RNA-Seq data without a reference genome.
(PubMed)
Nature Biotechnology
29(7):644-652
(2011)
|
|
uc-echo
error correction algorithm designed for short-reads from NGS
|
Versions of package uc-echo |
Release | Version | Architectures |
stretch | 1.12-9 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
buster | 1.12-11 | amd64,arm64,armhf,i386 |
bookworm | 1.12-18 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bullseye | 1.12-15 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
trixie | 1.12-19 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
sid | 1.12-19 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
jessie | 1.12-7 | amd64,armel,armhf,i386 |
|
License: DFSG free
|
ECHO is an error correction algorithm designed for short-reads
from next-generation sequencing platforms such as Illumina's
Genome Analyzer II. The algorithm uses a Bayesian framework to
improve the quality of the reads in a given data set by employing
maximum a posteriori estimation.
Topics: Data management; Sequencing
|
|
vcftools
Collection of tools to work with VCF files
|
Versions of package vcftools |
Release | Version | Architectures |
jessie-security | 0.1.12+dfsg-1+deb8u1 | amd64,armel,armhf,i386 |
bookworm | 0.1.16-3 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
buster | 0.1.16-1 | amd64,arm64,armhf,i386 |
trixie | 0.1.16-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bullseye | 0.1.16-2 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
sid | 0.1.16-3 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
stretch | 0.1.14+dfsg-4+deb9u1 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
jessie | 0.1.12+dfsg-1 | amd64,armel,armhf,i386 |
Debtags of package vcftools: |
role | program |
|
License: DFSG free
|
VCFtools is a program package designed for working with VCF files, such as
those generated by the 1000 Genomes Project. The aim of VCFtools is to
provide methods for working with VCF files: validating, merging, comparing
and calculating some basic population genetic statistics.
The package is enhanced by the following packages:
multiqc
Please cite:
Petr Danecek, Adam Auton, Goncalo Abecasis, Cornelis A. Albers, Eric Banks, Mark A. DePristo, Robert E. Handsaker, Gerton Lunter, Gabor T. Marth, Stephen T. Sherry, Gilean McVean and Richard Durbin:
The variant call format and VCFtools.
(PubMed,eprint)
Bioinformatics
27(15):2156-8
(2011)
|
|
velvet
Nucleic acid sequence assembler for very short reads
|
Versions of package velvet |
Release | Version | Architectures |
trixie | 1.2.10+dfsg1-9 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 1.2.10+dfsg1-8 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
sid | 1.2.10+dfsg1-9 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bullseye | 1.2.10+dfsg1-7 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
buster | 1.2.10+dfsg1-5 | amd64,arm64,armhf,i386 |
stretch | 1.2.10+dfsg1-3 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
jessie | 1.2.10+dfsg1-1 | amd64,armel,armhf,i386 |
Debtags of package velvet: |
biology | nuceleic-acids |
field | biology, biology:bioinformatics |
interface | commandline |
role | program |
use | analysing |
|
License: DFSG free
|
Velvet is a de novo genomic assembler specially designed for short read
sequencing technologies, such as Solexa or 454, developed by Daniel Zerbino and
Ewan Birney at the European Bioinformatics Institute (EMBL-EBI), near
Cambridge, in the United Kingdom.
Velvet currently takes in short read sequences, removes errors then produces
high quality unique contigs. It then uses paired read information, if
available, to retrieve the repeated areas between contigs.
|
|
velvet-long
Nucleic acid sequence assembler for very short reads, long version
|
Versions of package velvet-long |
Release | Version | Architectures |
trixie | 1.2.10+dfsg1-9 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
buster | 1.2.10+dfsg1-5 | amd64,arm64,armhf,i386 |
jessie | 1.2.10+dfsg1-1 | amd64,armel,armhf,i386 |
bullseye | 1.2.10+dfsg1-7 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
bookworm | 1.2.10+dfsg1-8 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
stretch | 1.2.10+dfsg1-3 | amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x |
sid | 1.2.10+dfsg1-9 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
|
License: DFSG free
|
Velvet is a de novo genomic assembler specially designed for short read
sequencing technologies, such as Solexa or 454, developed by Daniel Zerbino and
Ewan Birney at the European Bioinformatics Institute (EMBL-EBI), near
Cambridge, in the United Kingdom.
Velvet currently takes in short read sequences, removes errors then produces
high quality unique contigs. It then uses paired read information, if
available, to retrieve the repeated areas between contigs.
This package installs special long-mode versions of Velvet, as recommended
in the Velvet tutorials.
|
|
velvetoptimiser
automatically optimise Velvet de novo assembly parameters
|
Versions of package velvetoptimiser |
Release | Version | Architectures |
trixie | 2.2.6-5 | all |
jessie | 2.2.5-2 | all |
stretch | 2.2.5-5 | all |
buster | 2.2.6-2 | all |
bullseye | 2.2.6-3 | all |
bookworm | 2.2.6-5 | all |
sid | 2.2.6-5 | all |
|
License: DFSG free
|
VelvetOptimiser is a multi-threaded Perl script for automatically optimising
the three primary parameter options (K, -exp_cov, -cov_cutoff) for the Velvet
de novo sequence assembler.
|
|
vsearch
tool for processing metagenomic sequences
|
Versions of package vsearch |
Release | Version | Architectures |
trixie | 2.29.1-1 | amd64,arm64,mips64el,ppc64el,riscv64 |
stretch | 2.3.4-1 | amd64 |
bookworm | 2.22.1-1 | amd64,arm64,ppc64el |
sid | 2.29.1-1 | amd64,arm64,mips64el,ppc64el,riscv64 |
buster | 2.10.4-1 | amd64 |
bullseye | 2.15.2-3 | amd64,arm64,ppc64el |
|
License: DFSG free
|
Versatile 64-bit multithreaded tool for processing metagenomic sequences,
including searching, clustering, chimera detection, dereplication, sorting,
masking and shuffling.
The aim of this project is to create an alternative to the USEARCH tool
developed by Robert C. Edgar (2010). The new tool should:
- have a 64-bit design that handles very large databases and much more
than 4GB of memory
- be as accurate or more accurate than usearch
- be as fast or faster than usearch
|
|
wham-align
Wisconsin's High-Throughput Alignment Method
|
Versions of package wham-align |
Release | Version | Architectures |
sid | 0.1.5-8 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 0.1.5-8 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
trixie | 0.1.5-8 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bullseye | 0.1.5-8 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
|
License: DFSG free
|
This package provides functionality analogous to BWA or
bowtie in aligning reads from next-generation DNA sequencing
machines against a reference genome.
Please cite:
Yinan Li, Allie Terrell and Jignesh M. Patel:
WHAM: A High-throughput Sequence Alignment Method
(eprint)
Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2011, Athens, Greece
(2011)
|
|
wigeon
reimplementation of the Pintail 16S DNA anomaly detection utility
|
Versions of package wigeon |
Release | Version | Architectures |
sid | 20101212+dfsg1-6 | all |
jessie | 20101212+dfsg-1 | all |
buster | 20101212+dfsg1-2 | all |
stretch | 20101212+dfsg1-1 | all |
bookworm | 20101212+dfsg1-5 | all |
trixie | 20101212+dfsg1-6 | all |
bullseye | 20101212+dfsg1-4 | all |
|
License: DFSG free
|
WigeoN examines the sequence conservation between a query and a trusted
reference sequence, both in NAST alignment format. Based on the sequence
identity between the query and the reference sequence, there is an
expected amount of variation among the alignment. If the observed
variation is greater than the 95% quantile of the distribution of
variation observed between non-anomalous sequences, then it is flagged
as an anomaly.
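The flagging rule described above (observed variation compared against a 95% quantile) can be sketched as follows. This is not WigeoN's code: how the variation is measured is not shown here, the nearest-rank quantile is an assumed choice, and the function names and reference values are hypothetical.

```python
def quantile_nearest_rank(values, percent):
    """Nearest-rank quantile: the smallest value such that at least
    `percent` percent of the data lies at or below it."""
    if not values:
        raise ValueError("at least one reference value is required")
    ordered = sorted(values)
    # Integer ceiling of percent * n / 100, kept at 1 or above.
    rank = max((percent * len(ordered) + 99) // 100, 1)
    return ordered[rank - 1]


def is_anomalous(observed_variation, reference_variations, percent=95):
    """Flag a query whose variation exceeds the reference quantile."""
    return observed_variation > quantile_nearest_rank(reference_variations, percent)


# Hypothetical variation values from non-anomalous sequences: 1, 2, ..., 100.
reference = list(range(1, 101))
print(quantile_nearest_rank(reference, 95))  # 95
print(is_anomalous(96, reference))           # True
print(is_anomalous(95, reference))           # False
```

Only values strictly greater than the quantile are flagged, which matches the "greater than" wording of the description.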
WigeoN is a flexible command-line based reimplementation of the Pintail
algorithm (Appl Environ Microbiol. 2005 Dec;71(12):7724-36).
WigeoN is useful for flagging chimeras and anomalies only in near
full-length 16S rRNA sequences. WigeoN lacks sensitivity with sequences
less than 1000 bp.
To run WigeoN, you need NAST-formatted sequences generated by the
nast-ier utility.
WigeoN is part of the microbiomeutil suite.
Please cite:
Brian J. Haas, Dirk Gevers, Ashlee M. Earl, Mike Feldgarden, Doyle V. Ward, Georgia Giannoukos, Dawn Ciulla, Diana Tabbaa, Sarah K. Highlander, Erica Sodergren, Barbara Methé, Todd Z. DeSantis, The Human Microbiome Consortium, Joseph F. Petrosino, Rob Knight and Bruce W. Birren:
Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons.
(PubMed,eprint)
Genome Research
21(3):494-504
(2011)
|
|
Official Debian packages with lower relevance
nanolyse
remove lambda phage reads from a fastq file
|
Versions of package nanolyse |
Release | Version | Architectures |
bullseye | 1.2.0-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
sid | 1.2.0-4 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bookworm | 1.2.0-4 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
trixie | 1.2.0-4 | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
|
License: DFSG free
|
NanoLyse is a tool for rapid removal of contaminant DNA, using the
Minimap2 aligner through the mappy Python binding. A typical application
would be the removal of the lambda phage control DNA fragment supplied
by ONT, for which the reference sequence is included in the package.
However, this approach may lead to unwanted loss of reads from regions
highly homologous to the lambda phage genome.
|
|
python3-anndata
annotated gene by sample numpy matrix
|
Versions of package python3-anndata |
Release | Version | Architectures |
sid | 0.10.6-1 | all |
bookworm | 0.8.0-4 | all |
bullseye | 0.7.5+ds-3 | all |
upstream | 0.11.1 |
|
License: DFSG free
|
AnnData provides a scalable way of keeping track of data together
with learned annotations. It is used within Scanpy, for which it was
initially developed. Both packages have been introduced in Genome
Biology (2018).
|
|
r-bioc-isoformswitchanalyzer
Identify, Annotate and Visualize Alternative Splicing and
|
Versions of package r-bioc-isoformswitchanalyzer |
Release | Version | Architectures |
sid | 2.4.0+ds-1 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
trixie | 2.4.0+ds-1 | amd64,arm64,mips64el,ppc64el,riscv64,s390x |
bookworm | 1.20.0+ds-1 | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
upstream | 2.6.0 |
|
License: DFSG free
|
Isoform Switches with Functional Consequences from both short- and
long-read RNA-seq data. Analysis of alternative splicing and
isoform switches with predicted functional consequences (e.g.
gain/loss of protein domains etc.) from quantification of all
types of RNASeq by tools such as Kallisto, Salmon, StringTie,
Cufflinks/Cuffdiff etc.
|
|
Debian packages in contrib or non-free
bcbio
toolkit for analysing high-throughput sequencing data
|
Versions of package bcbio |
Release | Version | Architectures |
bullseye | 1.2.5-1 (contrib) | all |
bookworm | 1.2.9-2 (contrib) | all |
buster | 1.1.2-3 | all |
sid | 1.2.9-2 (contrib) | all |
|
License: DFSG free, but needs non-free components
|
This package installs the command line tools of the bcbio-nextgen
toolkit implementing best-practice pipelines for fully automated high
throughput sequencing analysis.
A high-level configuration file specifies inputs and analysis parameters
to drive a parallel pipeline that handles distributed execution,
idempotent processing restarts and safe transactional steps. The project
contributes a shared community resource that handles the data processing
component of sequencing analysis, providing researchers with more time
to focus on the downstream biology.
This package builds, and having it in Debian unstable helps the Debian
developers to synchronize their efforts. But unless a series of external
dependencies are installed manually, the functionality of bcbio in
Debian is only a shadow of itself. Please use the official distribution
of bcbio for the time being, which means "use conda". The TODO file in
the Debian directory should give an overview on progress for Debian
packaging.
|
cufflinks
Transcript assembly, differential expression and regulation for RNA-Seq
|
Versions of package cufflinks |
Release | Version | Architectures |
bookworm | 2.2.1+dfsg.1-9 (non-free) | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
buster | 2.2.1+dfsg.1-3 (non-free) | amd64,arm64,armhf,i386 |
stretch | 2.2.1-3 (non-free) | amd64 |
jessie | 2.2.1-1 (non-free) | amd64 |
trixie | 2.2.1+dfsg.1-10 (non-free) | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
bullseye | 2.2.1+dfsg.1-8 (non-free) | amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x |
sid | 2.2.1+dfsg.1-10 (non-free) | amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x |
|
License: non-free
|
Cufflinks assembles transcripts, estimates their abundances, and
tests for differential expression and regulation in RNA-Seq samples.
It accepts aligned RNA-Seq reads and assembles the alignments into a
parsimonious set of transcripts. Cufflinks then estimates the
relative abundances of these transcripts based on how many reads
support each one.
This package provides the binary of cufflinks and associated tools, i.e.
compress_gtf, cuffcompare, cuffdiff, cuffmerge, cuffnorm, cuffquant and
gtf_to_sam.
|
vdjtools
framework for post-analysis of B/T cell repertoires
|
Versions of package vdjtools |
Release | Version | Architectures |
trixie | 1.2.1+git20190311+repack-2 (non-free) | all |
bookworm | 1.2.1+git20190311+repack-1 (non-free) | all |
bullseye | 1.2.1+git20190311-5 (non-free) | all |
sid | 1.2.1+git20190311+repack-2 (non-free) | all |
|
License: non-free
|
VDJtools is an open-source Java/Groovy-based framework designed
to facilitate analysis of immune repertoire sequencing (RepSeq)
data. VDJtools computes a wide set of statistics and is able to perform
various forms of cross-sample analysis. Both comprehensive tabular
output and publication-ready plots are provided.
The main aims of the VDJtools Project are:
- To ensure consistency between post-analysis methods and results
- To save the time of bioinformaticians analyzing RepSeq data
- To create an API framework facilitating development of new RepSeq
analysis applications
- To provide a simple enough command line tool so it could be used by
immunologists and biologists with little computational background
Please cite:
M. Shugay, D.V. Bagaev, M.A. Turchaninova, D.A. Bolotin, O.V. Britanova, E.V. Putintseva, M.V. Pogorelyy, V.I. Nazarov, I.V. Zvyagin, V.I. Kirgizova, K.I. Kirgizov, E.V. Skorobogatova and D.M. Chudakov:
VDJtools: Unifying Post-analysis of T Cell Receptor Repertoires.
(PubMed,eprint)
PLoS Comput Biol.
11(11):e1004503
(2015)
|
Packaging has started and developers might try the packaging code in VCS
graphmap2
highly sensitive and accurate mapper for long, error-prone reads
|
Versions of package graphmap2 |
Release | Version | Architectures |
VCS | 0.6.4-1 | all |
|
License: MIT
Debian package not available
Version: 0.6.4-1
|
GraphMap2 is a highly sensitive and accurate mapper for long,
error-prone reads. The mapping algorithm is designed to analyse nanopore
sequencing reads: it progressively refines candidate alignments to
robustly handle potentially high error rates and uses a fast graph traversal
to align long reads with speed and high precision (>95%). Evaluation on
MinION sequencing data sets against short- and long-read mappers
indicates that GraphMap increases mapping sensitivity by 10–80% and maps
95% of bases. GraphMap alignments enabled single-nucleotide variant
calling on the human genome with increased sensitivity (15%) over the
next best mapper, precise detection of structural variants from length
100 bp to 4 kbp, and species and strain-specific identification of
pathogens using MinION reads.
|
mosaik-aligner
reference-guided aligner for next-generation sequencing
|
Versions of package mosaik-aligner |
Release | Version | Architectures |
VCS | 2.2.30+20140627-1 | all |
|
License: MIT
Debian package not available
Version: 2.2.30+20140627-1
|
MosaikBuild converts various sequence formats into Mosaik’s native read
format. MosaikAligner pairwise aligns each read to a specified series of
reference sequences. MosaikSort resolves paired-end reads and sorts the
alignments by the reference sequence coordinates. Finally, MosaikText
converts alignments to different text-based formats.
At this time, the workflow consists of supplying sequences in FASTA,
FASTQ, Illumina Bustard & Gerald, or SRF file formats and producing
results in the BLAT axt, the BAM/SAM, the UCSC Genome Browser bed, or
the Illumina ELAND formats.
|
nanoplot
plotting scripts for long read sequencing data
|
Versions of package nanoplot |
Release | Version | Architectures |
VCS | 1.36.2-1 | all |
|
License: MIT
Debian package not available
Version: 1.36.2-1
|
NanoPlot provides plotting scripts for long read sequencing data.
These scripts perform data extraction from Oxford Nanopore sequencing data
in the following formats:
- fastq files (optionally compressed)
- fastq files generated by albacore, guppy or MinKNOW containing additional
information (optionally compressed)
- sorted bam files
- sequencing_summary.txt output table generated by albacore, guppy or
MinKNOW basecalling (optionally compressed)
- fasta files (optionally compressed)
- multiple files of the same type can be offered simultaneously
|
r-bioc-mofa2
Multi-Omics Factor Analysis v2
|
Versions of package r-bioc-mofa2 |
Release | Version | Architectures |
VCS | 1.2.2+ds-1 | all |
|
License: GPL-2+
Debian package not available
Version: 1.2.2+ds-1
|
The MOFA2 package contains a collection of tools for training and
analysing multi-omic factor analysis (MOFA). MOFA is a probabilistic
factor model that aims to identify principal axes of variation from data
sets that can comprise multiple omic layers and/or groups of samples.
Additional time or space information on the samples can be incorporated
using the MEFISTO framework, which is part of MOFA2. Downstream analysis
functions are available for inspecting the molecular features underlying
each factor, visualisation, imputation, etc.
|
umap
quantify genome and methylome mappability
|
Versions of package umap |
Release | Version | Architectures |
VCS | 1.0.0-1 | all |
|
License: GPL-3.0
Debian package not available
Version: 1.0.0-1
|
Umap identifies uniquely mappable regions of any genome. Its Bismap
extension identifies mappability of the bisulfite-converted
genome (methylome).
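
As a rough illustration of what "uniquely mappable" means, the following minimal Python sketch marks every position of a toy sequence that is covered by at least one k-mer occurring exactly once. It is an assumption-laden simplification and not Umap's own implementation: it looks at the forward strand only, requires exact matches, and the function names are hypothetical.

```python
from collections import Counter


def unique_kmer_starts(genome: str, k: int) -> list[bool]:
    """For each start position, tell whether the k-mer beginning there
    occurs exactly once in the sequence (forward strand, exact match)."""
    n = len(genome) - k + 1
    if n <= 0:
        return []
    counts = Counter(genome[i:i + k] for i in range(n))
    return [counts[genome[i:i + k]] == 1 for i in range(n)]


def covered_by_unique_kmer(genome: str, k: int) -> list[bool]:
    """Mark each position that lies inside at least one unique k-mer."""
    covered = [False] * len(genome)
    for start, is_unique in enumerate(unique_kmer_starts(genome, k)):
        if is_unique:
            for pos in range(start, start + k):
                covered[pos] = True
    return covered


if __name__ == "__main__":
    # Toy sequence: the 3-mer "ACG" occurs twice, all other 3-mers once.
    print(unique_kmer_starts("ACGTACGA", 3))
    print(covered_by_unique_kmer("ACGTACGA", 3))
```

On a real genome this in-memory counting would not scale; the sketch only serves to make the definition concrete.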
|
No known packages available
annovar
annotate genetic variants detected from diverse genomes
|
|
License: Open Source for non-profit
Debian package not available
|
ANNOVAR is an efficient software tool to utilize up-to-date information
to functionally annotate genetic variants detected from diverse genomes
(including human genome hg18, hg19, as well as mouse, worm, fly, yeast and
many others). Given a list of variants with chromosome, start position, end
position, reference nucleotide and observed nucleotides, ANNOVAR can perform:
1. Gene-based annotation: identify whether SNPs or CNVs cause protein coding
changes and the amino acids that are affected. Users can flexibly use RefSeq
genes, UCSC genes, ENSEMBL genes, GENCODE genes, or many other gene definition
systems.
2. Region-based annotations: identify variants in specific genomic regions,
for example, conserved regions among 44 species, predicted transcription
factor binding sites, segmental duplication regions, GWAS hits, database
of genomic variants, DNAse I hypersensitivity sites, ENCODE
H3K4Me1/H3K4Me3/H3K27Ac/CTCF sites, ChIP-Seq peaks, RNA-Seq peaks, or many
other annotations on genomic intervals.
3. Filter-based annotation: identify variants that are reported in dbSNP,
or identify the subset of common SNPs (MAF>1%) in the 1000 Genomes Project,
or identify the subset of non-synonymous SNPs with SIFT score>0.05, or many
other annotations on specific mutations.
4. Other functionalities: Retrieve the nucleotide sequence in any
user-specific genomic positions in batch, identify a candidate gene list
for Mendelian diseases from exome data, identify a list of SNPs from
1000 Genomes that are in strong LD with a GWAS hit, and many other
creative utilities.
On a modern desktop computer (3 GHz Intel Xeon CPU, 8 GB memory), for
4.7 million variants, ANNOVAR requires ~4 minutes to perform
gene-based functional annotation, or ~15 minutes to perform the stepwise
"variants reduction" procedure, making it practical to handle hundreds
of human genomes in a day.
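
To make the region-based and filter-based annotation ideas above more concrete, here is a minimal Python sketch. It is not ANNOVAR code and does not use ANNOVAR's file formats or databases: the `Variant` class, the function names, and the region and frequency tables are all hypothetical and assumed to be held in memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    ref: str    # reference nucleotide(s)
    alt: str    # observed nucleotide(s)


def overlapping_regions(variant, regions):
    """Region-based idea: names of all (chrom, start, end, name) regions
    that overlap the variant's interval."""
    return [
        name
        for chrom, start, end, name in regions
        if chrom == variant.chrom and start <= variant.end and variant.start <= end
    ]


def common_variants(variants, allele_freq, min_maf=0.01):
    """Filter-based idea: keep variants whose known minor allele
    frequency exceeds min_maf; unknown variants count as frequency 0."""
    return [v for v in variants if allele_freq.get(v, 0.0) > min_maf]


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    v1 = Variant("chr1", 100, 100, "A", "G")
    v2 = Variant("chr2", 50, 52, "ACT", "-")
    regions = [
        ("chr1", 90, 110, "conserved_1"),
        ("chr1", 200, 300, "tfbs_1"),
        ("chr2", 52, 60, "segdup_1"),
    ]
    print(overlapping_regions(v1, regions))
    print(common_variants([v1, v2], {v1: 0.05, v2: 0.001}))
```

A linear scan over all regions is only reasonable for small tables; for millions of variants an indexed interval structure would be the more realistic choice.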
|
forge
genome assembler for mixed read types
|
|
License: Apache 2.0
Debian package not available
|
Forge Genome Assembler is a parallel, MPI based genome assembler for
mixed read types.
Forge is a classic "Overlap layout consensus" genome assembler written
by Darren Platt and Dirk Evers. Implemented in C++ and using the
parallel MPI library, it runs on one or more machines in a network and
can scale to very large numbers of reads provided there is enough
collective memory on the machines used. It generates a full consensus
alignment of all reads and can handle mixtures of Sanger, 454 and Illumina
reads. There is some support for SOLiD color space, and it includes
built-in tools for vector trimming and contamination screening.
Forge was originally developed at Exelixis, which has kindly agreed to
place the software, which underwent much subsequent development outside
Exelixis, into the public domain. Forge works with most of the common
MPI implementations.
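
The "overlap" step of an overlap-layout-consensus assembler can be illustrated with a short Python sketch. This is a generic textbook-style example, not Forge's C++/MPI implementation: it only handles exact suffix/prefix matches between two error-free reads, and the function names are hypothetical.

```python
def suffix_prefix_overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.
    Returns 0 if no exact overlap of at least `min_len` bases exists."""
    for length in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0


def merge_reads(a: str, b: str, min_len: int = 3):
    """Merge `b` onto the end of `a` if they overlap, else return None."""
    overlap = suffix_prefix_overlap(a, b, min_len)
    if overlap == 0:
        return None
    return a + b[overlap:]


if __name__ == "__main__":
    # The last three bases of the first read match the start of the second.
    print(suffix_prefix_overlap("ACGTAC", "TACGGA"))
    print(merge_reads("ACGTAC", "TACGGA"))
```

A real assembler additionally has to tolerate sequencing errors, consider reverse complements, and avoid comparing all read pairs, which is where the layout and consensus stages and the parallelism come in.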
|
|