Alignment quality assessment, with potentially problematic sites and sequences flagged.
Sequences: 51
Alignment length: 3244
Mean N content: 0.0151
Mean completeness: 0.9481
Purple blocks represent N positions (ambiguous nucleotides) along the alignment for each sequence.
This plot may be downsampled for performance: here every sequence is shown, in 2-bp windows.
Categories of mutations checked were clustered SNPs (≥4 SNPs within 6 bp); SNPs adjacent to Ns; SNPs adjacent to gaps; frame-breaking indels.
| Flagged site | Type | Minimum | Maximum | Length | Present in | Note |
|---|---|---|---|---|---|---|
| 25 | site | 25 | 25 | 1 | PHL036\|mylona_marsh\|elliville\|2026-02-19 | N_adjacent |
| 46 | site | 46 | 46 | 1 | LHFV004\|wilkins_sound\|samford\|1982 | gap_adjacent |
| 189 | site | 189 | 189 | 1 | LHFV010\|mylona_marsh\|elliville\|1990 | N_adjacent |
| 574 | site | 574 | 574 | 1 | LHFV010\|mylona_marsh\|elliville\|1990 | N_adjacent |
| 678 | site | 678 | 678 | 1 | LHFV010\|mylona_marsh\|elliville\|1990 | N_adjacent |
| 1160 | site | 1160 | 1160 | 1 | LHFV001\|mylona_marsh\|elliville\|2010 | N_adjacent |
| 2436 | site | 2436 | 2436 | 1 | LHFV007\|outer_otooles\|inis_aine\|2005 | N_adjacent;clustered_snps |
| 2437 | site | 2437 | 2437 | 1 | LHFV007\|outer_otooles\|inis_aine\|2005 | N_adjacent;clustered_snps |
| 2439 | site | 2439 | 2439 | 1 | LHFV007\|outer_otooles\|inis_aine\|2005 | clustered_snps |
| 2440 | site | 2440 | 2440 | 1 | LHFV007\|outer_otooles\|inis_aine\|2005 | clustered_snps |
| 2441 | site | 2441 | 2441 | 1 | LHFV007\|outer_otooles\|inis_aine\|2005 | clustered_snps |
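
The clustered-SNP criterion (≥4 SNPs within 6 bp) can be illustrated with a short Python sketch. This is not raccoon's implementation: the function name `clustered_snp_sites`, its parameters, and the choice to treat "within 6 bp" as an inclusive span of six alignment columns are all assumptions made for illustration. The input is the list of SNP positions for a single sequence.

```python
from typing import List, Set


def clustered_snp_sites(
    snp_positions: List[int], min_snps: int = 4, window: int = 6
) -> Set[int]:
    """Return SNP positions lying in any `window`-bp span that holds >= `min_snps` SNPs.

    Hypothetical helper: the span is counted inclusively, so positions
    p and q share a window when q - p + 1 <= window.
    """
    positions = sorted(set(snp_positions))
    flagged: Set[int] = set()
    start = 0
    for end, pos in enumerate(positions):
        # Advance the left edge until the span fits inside the window.
        while pos - positions[start] + 1 > window:
            start += 1
        # Enough SNPs in the current span: flag every one of them.
        if end - start + 1 >= min_snps:
            flagged.update(positions[start:end + 1])
    return flagged


# The five sites reported for LHFV007 in the table span exactly 6 bp.
lhfv007_sites = [2436, 2437, 2439, 2440, 2441]
print(sorted(clustered_snp_sites(lhfv007_sites)))
```

Under these assumptions, all five LHFV007 sites fall inside one 6 bp span, which is consistent with each of them carrying the `clustered_snps` note.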
Sequences are flagged for removal if they have more than 20 flagged sites.
No sequences flagged for removal.
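
The removal rule can be sketched by counting flagged sites per sequence and comparing each count with the threshold of 20. The sketch below is an assumption about how such a check might look, not raccoon's code; `sequences_to_remove` and `flagged_rows` are hypothetical names, and the rows are copied from the flagged-site table in this report.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def sequences_to_remove(
    flagged: Iterable[Tuple[int, str]], max_flagged_sites: int = 20
) -> List[str]:
    """Return names of sequences with more than `max_flagged_sites` distinct flagged sites."""
    # Deduplicate (site, sequence) pairs so a site with several notes counts once.
    counts = Counter(name for _site, name in set(flagged))
    return sorted(name for name, n in counts.items() if n > max_flagged_sites)


# (site, sequence name) pairs taken from the flagged-site table.
flagged_rows = [
    (25, "PHL036|mylona_marsh|elliville|2026-02-19"),
    (46, "LHFV004|wilkins_sound|samford|1982"),
    (189, "LHFV010|mylona_marsh|elliville|1990"),
    (574, "LHFV010|mylona_marsh|elliville|1990"),
    (678, "LHFV010|mylona_marsh|elliville|1990"),
    (1160, "LHFV001|mylona_marsh|elliville|2010"),
    (2436, "LHFV007|outer_otooles|inis_aine|2005"),
    (2437, "LHFV007|outer_otooles|inis_aine|2005"),
    (2439, "LHFV007|outer_otooles|inis_aine|2005"),
    (2440, "LHFV007|outer_otooles|inis_aine|2005"),
    (2441, "LHFV007|outer_otooles|inis_aine|2005"),
]
print(sequences_to_remove(flagged_rows))
```

The most-flagged sequence here, LHFV007, has five flagged sites, well under the threshold, so the list comes back empty.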
This plot shows sequence diversity across the alignment as Shannon diversity, smoothed over a 5-base window.
Shannon diversity summarizes how mixed the bases are at each alignment position: low values mean most sequences share the same base, and higher values mean more diversity at that position.
For a column with possible states A/T/C/G (Ns and gaps are ignored), the theoretical range is 0 to log₂(4) = 2 bits.
High diversity can be a sign of true biological variation, but it can also indicate problematic sites with many sequencing errors or misalignments.
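
To make the per-column calculation concrete, here is a minimal Python sketch of Shannon diversity with a 5-base smoothing window. It is an illustration under stated assumptions, not raccoon's implementation: the function names are hypothetical, the smoothing is assumed to be a centered moving average that shrinks at the alignment edges, and a column containing only Ns or gaps is assigned 0.

```python
import math
from collections import Counter
from typing import List, Sequence


def column_shannon(column: str) -> float:
    """Shannon diversity (bits) of one alignment column, ignoring Ns and gaps."""
    counts = Counter(base for base in column.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return 0.0  # assumption: no informative bases means no measurable diversity
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def smoothed_diversity(seqs: Sequence[str], window: int = 5) -> List[float]:
    """Per-column diversity averaged over a centered window (assumed smoothing scheme)."""
    length = len(seqs[0])
    raw = [column_shannon("".join(s[i] for s in seqs)) for i in range(length)]
    half = window // 2
    smoothed = []
    for i in range(length):
        lo, hi = max(0, i - half), min(length, i + half + 1)
        smoothed.append(sum(raw[lo:hi]) / (hi - lo))
    return smoothed


# Toy alignment: three conserved columns, then one column with all four bases.
toy = ["ACGT", "ACGA", "ACGC", "ACGG"]
print([round(v, 3) for v in smoothed_diversity(toy)])
```

A fully conserved column scores 0, and a column split evenly among A, C, G, and T reaches the maximum of 2 bits. Smoothing spreads an isolated peak across its neighbors, which is why a single highly variable site shows up as a low, broad bump in the plot.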
Alignment: cases_background.aln.fasta
Mask File: mask_sites.csv
Outdir: aln-qc
Command: raccoon aln-qc examples/lhfv/aln-qc/cases_background.aln.fasta -d examples/lhfv/aln-qc
Generated: 2026-03-16 21:50
Raccoon version: 1.0.2
Python: 3.14.2
Platform: Darwin 24.4.0