Raccoon aln-qc report

Alignment quality assessment, with potentially problematic sites and sequences flagged.

Generated 2026-03-16 21:50

Table of contents

  1. Report Summary
  2. Alignment N-content
  3. Flagged sites
  4. Location of flagged sites
  5. Flagged sequences
  6. Diversity

1. Report Summary

Sequences: 51
Alignment length: 3244

Mean N content: 0.0151
Mean completeness: 0.9481
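
The report does not state how these two summary values are defined. The sketch below is one plausible reading, not Raccoon's actual implementation: N content is taken as the fraction of alignment positions that are N, and completeness as the fraction that are unambiguous bases (A, C, G or T), so gaps count against completeness. That reading is consistent with the two values above not summing to 1, but it is an assumption. The function names and the toy alignment are hypothetical.

```python
def n_content(seq: str) -> float:
    """Fraction of positions in an aligned sequence that are N."""
    seq = seq.upper()
    return seq.count("N") / len(seq) if seq else 0.0


def completeness(seq: str) -> float:
    """Fraction of positions that are an unambiguous base (A, C, G or T).

    Assumption: gaps and Ns both reduce completeness.
    """
    seq = seq.upper()
    return sum(seq.count(b) for b in "ACGT") / len(seq) if seq else 0.0


def mean(values) -> float:
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Toy alignment (hypothetical data, not taken from this report)
alignment = {
    "seq1": "ACGTACGTAC",
    "seq2": "ACGTNNGTAC",
    "seq3": "--GTACGTNC",
}

mean_n = mean(n_content(s) for s in alignment.values())
mean_complete = mean(completeness(s) for s in alignment.values())
print(f"Mean N content: {mean_n:.4f}")
print(f"Mean completeness: {mean_complete:.4f}")
```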

2. Alignment N-content

Purple blocks represent N positions (ambiguous nucleotides) along the alignment for each sequence.
This plot may be downsampled for performance. Here every sequence is shown, and positions are grouped into 2-bp windows.

3. Flagged sites

Categories of mutations checked were clustered SNPs (≥4 SNPs within 6 bp); SNPs adjacent to Ns; SNPs adjacent to gaps; frame-breaking indels.

flagged  type  Minimum  Maximum  Length  present_in                                note
25       site  25       25       1       PHL036|mylona_marsh|elliville|2026-02-19  N_adjacent
46       site  46       46       1       LHFV004|wilkins_sound|samford|1982        gap_adjacent
189      site  189      189      1       LHFV010|mylona_marsh|elliville|1990       N_adjacent
574      site  574      574      1       LHFV010|mylona_marsh|elliville|1990       N_adjacent
678      site  678      678      1       LHFV010|mylona_marsh|elliville|1990       N_adjacent
1160     site  1160     1160     1       LHFV001|mylona_marsh|elliville|2010       N_adjacent
2436     site  2436     2436     1       LHFV007|outer_otooles|inis_aine|2005      N_adjacent;clustered_snps
2437     site  2437     2437     1       LHFV007|outer_otooles|inis_aine|2005      N_adjacent;clustered_snps
2439     site  2439     2439     1       LHFV007|outer_otooles|inis_aine|2005      clustered_snps
2440     site  2440     2440     1       LHFV007|outer_otooles|inis_aine|2005      clustered_snps
2441     site  2441     2441     1       LHFV007|outer_otooles|inis_aine|2005      clustered_snps
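
The checks listed for this section can be sketched as follows. This is an illustration under stated assumptions, not Raccoon's own code: a SNP is taken to be an unambiguous base that differs from the per-column majority base, "adjacent" means the immediately neighbouring position in the same sequence, and a cluster is any run of at least 4 SNPs spanning at most 6 bp. Frame-breaking indels are not covered. All function names and the toy sequences are hypothetical.

```python
from collections import Counter


def consensus(seqs):
    """Most common unambiguous base per column (None if a column has none)."""
    cons = []
    for column in zip(*seqs):
        counts = Counter(b for b in column if b in "ACGT")
        cons.append(counts.most_common(1)[0][0] if counts else None)
    return cons


def snp_positions(seq, cons):
    """0-based positions where seq has an unambiguous base differing from consensus."""
    return [
        i for i, (b, c) in enumerate(zip(seq, cons))
        if b in "ACGT" and c is not None and b != c
    ]


def flag_sites(seq, cons, min_snps=4, window=6):
    """Return {1-based position: set of notes} for one aligned sequence."""
    snps = snp_positions(seq, cons)
    flags = {}
    for i in snps:
        # Characters immediately left and right of the SNP (if they exist)
        neighbours = seq[max(i - 1, 0):i] + seq[i + 1:i + 2]
        if "N" in neighbours:
            flags.setdefault(i + 1, set()).add("N_adjacent")
        if "-" in neighbours:
            flags.setdefault(i + 1, set()).add("gap_adjacent")
    # Any min_snps consecutive SNPs that fit inside `window` bp form a cluster
    for start in range(len(snps) - min_snps + 1):
        run = snps[start:start + min_snps]
        if run[-1] - run[0] < window:
            for i in run:
                flags.setdefault(i + 1, set()).add("clustered_snps")
    return flags


# Toy alignment (hypothetical data): the third sequence carries the problems
seqs = [
    "AAAAAAAAAAAA",
    "AAAAAAAAAAAA",
    "CCNCCAAAAA-C",
]
flags = flag_sites(seqs[2], consensus(seqs))
for pos in sorted(flags):
    print(pos, ";".join(sorted(flags[pos])))
```

With the thresholds above, the five sites at 2436 to 2441 in the table fall within a single 6-bp span, which is why all of them carry the clustered_snps note.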

4. Location of flagged sites

Categories of mutations checked were clustered SNPs (≥4 SNPs within 6 bp); SNPs adjacent to Ns; SNPs adjacent to gaps; frame-breaking indels.

5. Flagged sequences

Sequences are flagged for removal if they contain more than 20 flagged sites.

No sequences flagged for removal.
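
The removal rule can be checked by hand against the flagged-sites table. The sketch below counts flagged sites per sequence and applies the threshold of 20; it assumes each table row names exactly one sequence in present_in, which holds for this report. The function and variable names are hypothetical.

```python
from collections import Counter

# (site, present_in) pairs copied from the flagged-sites table in this report
flagged_rows = [
    (25, "PHL036|mylona_marsh|elliville|2026-02-19"),
    (46, "LHFV004|wilkins_sound|samford|1982"),
    (189, "LHFV010|mylona_marsh|elliville|1990"),
    (574, "LHFV010|mylona_marsh|elliville|1990"),
    (678, "LHFV010|mylona_marsh|elliville|1990"),
    (1160, "LHFV001|mylona_marsh|elliville|2010"),
    (2436, "LHFV007|outer_otooles|inis_aine|2005"),
    (2437, "LHFV007|outer_otooles|inis_aine|2005"),
    (2439, "LHFV007|outer_otooles|inis_aine|2005"),
    (2440, "LHFV007|outer_otooles|inis_aine|2005"),
    (2441, "LHFV007|outer_otooles|inis_aine|2005"),
]

MAX_FLAGGED_SITES = 20  # threshold stated in this section


def sequences_to_remove(rows, threshold=MAX_FLAGGED_SITES):
    """Return (names with more than `threshold` flagged sites, per-sequence counts)."""
    counts = Counter(name for _, name in rows)
    to_remove = sorted(name for name, n in counts.items() if n > threshold)
    return to_remove, counts


to_remove, counts = sequences_to_remove(flagged_rows)
for name, n in counts.most_common():
    print(n, name)
print("Flagged for removal:", to_remove or "none")
```

The most heavily flagged sequence here has 5 flagged sites, well below the threshold, which agrees with no sequences being flagged.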

6. Diversity

This plot shows sequence diversity across the alignment as Shannon diversity, smoothed over a 5-base window.

Shannon diversity summarizes how mixed the bases are at each alignment position: low values mean most sequences share the same base, and higher values mean more diversity at that position. For a column with possible states A/T/C/G (Ns and gaps are ignored), the theoretical range is 0 to log₂(4) = 2 bits.
High diversity can be a sign of true biological variation, but it can also indicate problematic sites with many sequencing errors or misalignments.
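
As a worked illustration of the description above, the sketch below computes per-column Shannon diversity in bits over A/C/G/T, ignoring Ns and gaps, and then smooths it over a 5-base window. The report does not say how the smoothing is done; a centred moving average with truncated windows at the alignment ends is assumed here. Function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def shannon(column: str) -> float:
    """Shannon diversity (bits) of one alignment column over A/C/G/T only."""
    counts = Counter(b for b in column.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return 0.0  # column contains only Ns and/or gaps
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def smooth(values, window=5):
    """Centred moving average; windows are truncated at the ends (assumed method)."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half):i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Toy alignment (hypothetical data)
seqs = ["ACGT", "ACGA", "ACNC", "AT-G"]
columns = ["".join(col) for col in zip(*seqs)]
per_site = [shannon(c) for c in columns]
smoothed = smooth(per_site, window=5)
print([round(v, 3) for v in per_site])
```

In this toy example the first column is fully conserved and scores 0, while the last column contains all four bases in equal proportion and reaches the maximum of 2 bits.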

Datafiles

Alignment: cases_background.aln.fasta

Mask File: mask_sites.csv

Outdir: aln-qc

Report metadata

Command: raccoon aln-qc examples/lhfv/aln-qc/cases_background.aln.fasta -d examples/lhfv/aln-qc

Generated: 2026-03-16 21:50

Raccoon version: 1.0.2

Python: 3.14.2

Platform: Darwin 24.4.0