Raccoon Pipeline

Rigorous Alignment Curation: Cleanup Of Outliers and Noise

FASTA file(s)

One or more sequence files in FASTA format

FASTA format consists of a header line starting with ">" followed by the sequence data.

A file can contain multiple sequences, each with its own header.
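The FASTA layout described above can be read with a few lines of Python. This is a minimal sketch, not raccoon's own parser; the function name `read_fasta` and the example sequences are made up for illustration.

```python
import io
from typing import Iterator, List, Optional, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header: Optional[str] = None
    chunks: List[str] = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical two-record file, held in memory for the example.
example = io.StringIO(">seq1\nACGT\nNNAC\n>seq2\nGGCC\n")
records = list(read_fasta(example))
```

Each record is returned as a header (without the leading ">") and its sequence joined into a single string.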

raccoon seq-qc
Sequence QC
combined.fasta

Combined sequence file, filtered on N content and sequence length, with harmonized headers populated from metadata.

By default, raccoon attempts to match sequence IDs against the metadata and writes headers in the format ">sample|location|YYYY-MM-DD".
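The filtering and header harmonization can be pictured with the sketch below. It is not raccoon's implementation: the function names, the thresholds (`min_length=10`, `max_n_fraction=0.2`), the metadata layout (a dict keyed by sequence ID with `location` and `date` fields), and the choice to keep the original ID when no metadata row matches are all assumptions made for illustration.

```python
from typing import Dict


def n_fraction(seq: str) -> float:
    """Fraction of the sequence made up of N (ambiguous) bases."""
    if not seq:
        return 1.0  # treat an empty sequence as fully ambiguous
    return seq.upper().count("N") / len(seq)


def passes_filter(seq: str, min_length: int = 10, max_n_fraction: float = 0.2) -> bool:
    """Keep sequences that are long enough and not too N-rich.

    Both thresholds are hypothetical defaults, not raccoon's.
    """
    return len(seq) >= min_length and n_fraction(seq) <= max_n_fraction


def harmonized_header(seq_id: str, metadata: Dict[str, Dict[str, str]]) -> str:
    """Build a 'sample|location|YYYY-MM-DD' header from a metadata row.

    Assumption: an ID with no metadata match is returned unchanged.
    """
    row = metadata.get(seq_id)
    if row is None:
        return seq_id
    return f"{seq_id}|{row['location']}|{row['date']}"


# Hypothetical metadata table with a single sample.
metadata = {"s1": {"location": "Edinburgh", "date": "2024-01-15"}}
header = harmonized_header("s1", metadata)
```

With this metadata, `header` is `s1|Edinburgh|2024-01-15`, which is written after ">" in the combined file.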

MAFFT
Multiple Sequence Alignment
alignment.fasta

Multiple sequence alignment in FASTA format

or
Reference Mapping
Alignment vs Reference
alignment.fasta

Multiple sequence alignment against a reference, in FASTA format

raccoon aln-qc
Alignment QC
raccoon mask (optional)
Apply Mask
alignment.masked.fasta

Alignment with flagged sites masked
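Masking can be thought of as overwriting the flagged alignment columns in every sequence. The sketch below illustrates the idea only; the function name `mask_alignment`, the use of 0-based column indices, and the mask character "N" are assumptions, and raccoon's own behavior (for example, how it treats gap characters) may differ.

```python
from typing import Iterable, List, Tuple


def mask_alignment(
    records: Iterable[Tuple[str, str]],
    sites: Iterable[int],
    mask_char: str = "N",
) -> List[Tuple[str, str]]:
    """Replace the given 0-based columns with mask_char in every sequence.

    Assumption: every character at a flagged column is overwritten,
    including gaps.
    """
    site_set = set(sites)
    masked = []
    for header, seq in records:
        chars = [mask_char if i in site_set else base for i, base in enumerate(seq)]
        masked.append((header, "".join(chars)))
    return masked


# Hypothetical two-sequence alignment with columns 1 and 3 flagged.
alignment = [("a", "ACGT"), ("b", "A-GT")]
masked = mask_alignment(alignment, sites=[1, 3])
```

Because the same columns are replaced in every row, the masked alignment keeps its original length and column coordinates.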

Optional Inputs
IQ-TREE
Maximum Likelihood Tree Estimation
output.treefile

Phylogenetic tree in Newick format

raccoon tree-qc
Phylogenetic QC

Legend

Sequence Data
Metadata/Reports
Phylogenetic Tree
Optional Step