Raccoon seq-qc report

Combines sequence FASTA file(s) with metadata to harmonize sequence headers, and filters sequences by N content and length.

Generated 2026-03-16 19:12

Table of contents

  1. Report Summary
  2. Input files
  3. Sequence length distribution
  4. Metadata description
  5. Filtered sequences
  6. Final dataset
  7. Metadata issues
  Datafiles
  Report metadata

1. Report Summary

FASTA files: 2
Metadata files: 2

Total sequences: 53

Filters: Minimum length: 2000
Filtered: 1

Unique locations: 19

Date range: 1982 → 2026-02-28

2. Input files

FASTA files provided: cases.fasta, historical_outbreaks.fasta.

File         Seqs  Length (min)  Length (max)  Length (mean)  N (min)  N (max)  N (mean)
cases.fasta  43    3107          3107          3107.0         0.0151   0.0267   0.0164

ID  Length  N content  Status
PHL043 3107 0.0151 kept
PHL036 3107 0.0151 kept
PHL029 3107 0.0151 kept
PHL042 3107 0.0151 kept
PHL039 3107 0.0151 kept
PHL009 3107 0.0151 kept
PHL034 3107 0.0151 kept
PHL037 3107 0.0183 kept
PHL027 3107 0.0183 kept
PHL024 3107 0.0151 kept
PHL025 3107 0.0193 kept
PHL031 3107 0.0151 kept
PHL038 3107 0.0151 kept
PHL035 3107 0.0151 kept
PHL030 3107 0.0151 kept
PHL033 3107 0.0151 kept
PHL041 3107 0.0151 kept
PHL001 3107 0.0267 kept
PHL032 3107 0.0151 kept
PHL002 3107 0.0203 kept
PHL003 3107 0.0203 kept
PHL018 3107 0.0151 kept
PHL040 3107 0.0151 kept
PHL020 3107 0.0151 kept
PHL019 3107 0.0151 kept
PHL014 3107 0.0151 kept
PHL028 3107 0.0151 kept
PHL021 3107 0.0196 kept
PHL016 3107 0.0151 kept
PHL013 3107 0.02 kept
PHL026 3107 0.0151 kept
PHL010 3107 0.0151 kept
PHL007 3107 0.0151 kept
PHL023 3107 0.0151 kept
PHL022 3107 0.0151 kept
PHL008 3107 0.0151 kept
PHL017 3107 0.0151 kept
PHL004 3107 0.0206 kept
PHL015 3107 0.0151 kept
PHL005 3107 0.0151 kept
PHL012 3107 0.0151 kept
PHL011 3107 0.0212 kept
PHL006 3107 0.0151 kept

File                        Seqs  Length (min)  Length (max)  Length (mean)  N (min)  N (max)  N (mean)
historical_outbreaks.fasta  10    132           3244          2908.9         0.0      0.8237   0.0922

ID  Length  N content  Status
LHFV001 3244 0.0003 kept
LHFV002 3210 0.0 kept
LHFV003 3244 0.0 kept
LHFV004 3196 0.0 kept
LHFV005 3225 0.0 kept
LHFV006 3213 0.0 kept
LHFV007 3200 0.0006 kept
LHFV008 132 0.0 filtered
LHFV009 3200 0.8237 kept
LHFV010 3225 0.0971 kept
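
The per-file summary row can be cross-checked against the per-sequence rows. The sketch below recomputes the historical_outbreaks.fasta summary from the lengths and N-content values listed above. The `summarise` helper is hypothetical and not part of Raccoon; it assumes means are rounded to one decimal place for length and four for N content, as displayed in the table.

```python
from statistics import mean

# (length, N content) per record in historical_outbreaks.fasta,
# copied from the per-sequence table in this report.
records = {
    "LHFV001": (3244, 0.0003),
    "LHFV002": (3210, 0.0),
    "LHFV003": (3244, 0.0),
    "LHFV004": (3196, 0.0),
    "LHFV005": (3225, 0.0),
    "LHFV006": (3213, 0.0),
    "LHFV007": (3200, 0.0006),
    "LHFV008": (132, 0.0),
    "LHFV009": (3200, 0.8237),
    "LHFV010": (3225, 0.0971),
}


def summarise(records):
    """Hypothetical helper: rebuild one per-file summary row."""
    lengths = [length for length, _ in records.values()]
    n_values = [n for _, n in records.values()]
    return {
        "seqs": len(records),
        "length_min": min(lengths),
        "length_max": max(lengths),
        "length_mean": round(mean(lengths), 1),  # assumed display rounding
        "n_min": min(n_values),
        "n_max": max(n_values),
        "n_mean": round(mean(n_values), 4),  # assumed display rounding
    }


summary = summarise(records)
print(summary)
```

The summary is computed over all ten records, including LHFV008, which is why the minimum length is 132 even though that sequence is later filtered.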

Metadata files provided: case_metadata.csv, historical_outbreaks_metadata.tsv.

3. Sequence length distribution

Distribution of sequence lengths across all input FASTA files. If a minimum length threshold is specified, it will be shown in the plot.

4. Metadata description

Unique locations: 19

Date range: 1982 → 2026-02-28

Date distribution by location.

5. Filtered sequences

The following sequences failed the length or N-content filters and will not be present in the output FASTA file:

file id parsed_id length n_content reason
historical_outbreaks.fasta LHFV008 LHFV008 132 0.0 length < 2000
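
As a rough illustration of how a length and N-content filter of this kind could behave, here is a minimal sketch. It is not Raccoon's implementation: the function names, the `max_n` parameter, and the "n_content > ..." reason string are assumptions for illustration, and N content is assumed to be the fraction of bases that are N.

```python
def n_content(seq: str) -> float:
    """Fraction of bases that are N (case-insensitive); 0.0 for an empty sequence."""
    if not seq:
        return 0.0
    return seq.upper().count("N") / len(seq)


def qc_status(seq: str, min_length=None, max_n=None):
    """Hypothetical filter: return (status, reason) for one sequence.

    A threshold left as None is not applied.
    """
    if min_length is not None and len(seq) < min_length:
        return "filtered", f"length < {min_length}"
    if max_n is not None and n_content(seq) > max_n:
        return "filtered", f"n_content > {max_n}"
    return "kept", ""


# A 132-base sequence fails a minimum length of 2000.
short_result = qc_status("A" * 132, min_length=2000)

# A 3200-base sequence that is mostly N passes when only a length
# threshold is set, and fails once an N-content threshold is added.
mostly_n = "N" * 2636 + "A" * 564
length_only = qc_status(mostly_n, min_length=2000)
with_n_filter = qc_status(mostly_n, min_length=2000, max_n=0.5)
```

This run set only a minimum length of 2000, which is consistent with LHFV009 (N content 0.8237) being kept while LHFV008 (length 132) was filtered.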

6. Final dataset

ID  Length  N content
PHL043|mylona_marsh|duggan_demesne|2026-02-28 3107 0.0151
PHL036|mylona_marsh|elliville|2026-02-19 3107 0.0151
PHL029|mylona_marsh|myra_myre|2026-02-14 3107 0.0151
PHL042|mylona_marsh|myra_myre|2026-02-27 3107 0.0151
PHL039|mylona_marsh|myra_myre|2026-02-24 3107 0.0151
PHL009|mylona_marsh|morang_a_moor|2026-01-20 3107 0.0151
PHL034|mylona_marsh|joyce_jungle|2026-02-18 3107 0.0151
PHL037|mylona_marsh|joyce_jungle|2026-02-20 3107 0.0183
PHL027|mylona_marsh|myra_myre|2026-02-11 3107 0.0183
PHL024|mylona_marsh|lusamaki_lake|2026-02-10 3107 0.0151
PHL025|mylona_marsh|elliville|2026-02-10 3107 0.0193
PHL031|mylona_marsh|elliville|2026-02-16 3107 0.0151
PHL038|mylona_marsh|elliville|2026-02-20 3107 0.0151
PHL035|mylona_marsh|elliville|2026-02-18 3107 0.0151
PHL030|mylona_marsh|elliville|2026-02-15 3107 0.0151
PHL033|mylona_marsh|elliville|2026-02-17 3107 0.0151
PHL041|mylona_marsh|lililand|2026-02-26 3107 0.0151
PHL001|mylona_marsh|morang_a_moor|2025-12-03 3107 0.0267
PHL032|lomanland|nickopolis|2026-02-16 3107 0.0151
PHL002|mylona_marsh|morang_a_moor|2025-12-08 3107 0.0203
PHL003|mylona_marsh|morang_a_moor|2025-12-28 3107 0.0203
PHL018|mylona_marsh|elliville|2026-02-07 3107 0.0151
PHL040|mylona_marsh|elliville|2026-02-24 3107 0.0151
PHL020|mylona_marsh|elliville|2026-02-08 3107 0.0151
PHL019|mylona_marsh|lililand|2026-02-07 3107 0.0151
PHL014|mylona_marsh|morang_a_moor|2026-02-01 3107 0.0151
PHL028|mylona_marsh|morang_a_moor|2026-02-11 3107 0.0151
PHL021|mylona_marsh|lililand|2026-02-08 3107 0.0196
PHL016|mylona_marsh|lililand|2026-02-05 3107 0.0151
PHL013|mylona_marsh|lililand|2026-01-30 3107 0.02
PHL026|mylona_marsh|elliville|2026-02-10 3107 0.0151
PHL010|mylona_marsh|bedeburgh|2026-01-20 3107 0.0151
PHL007|mylona_marsh|bedeburgh|2026-01-10 3107 0.0151
PHL023|lomanland|nickopolis|2026-02-09 3107 0.0151
PHL022|lomanland|nickopolis|2026-02-08 3107 0.0151
PHL008|mylona_marsh|faux_kent|2026-01-17 3107 0.0151
PHL017|mylona_marsh|donkor_dale|2026-02-06 3107 0.0151
PHL004|mylona_marsh|quaye_quay|2026-01-02 3107 0.0206
PHL015|mylona_marsh|maloney_mere|2026-02-01 3107 0.0151
PHL005|mylona_marsh|donkor_dale|2026-01-03 3107 0.0151
PHL012|mylona_marsh|lusamaki_lake|2026-01-26 3107 0.0151
PHL011|mylona_marsh|elliville|2026-01-23 3107 0.0212
PHL006|mylona_marsh|elliville|2026-01-05 3107 0.0151
LHFV001|mylona_marsh|elliville|2010 3244 0.0003
LHFV002|lomanland|radsborough|2006 3210 0.0
LHFV003|goodfellow_forest|glenian|2007 3244 0.0
LHFV004|wilkins_sound|samford|1982 3196 0.0
LHFV005|goodfellow_forest|glenian|1990 3225 0.0
LHFV006|willmott_woods|hannahland|2019 3213 0.0
LHFV007|outer_otooles|inis_aine|2005 3200 0.0006
LHFV009|lomanland|centre|2005 3200 0.8237
LHFV010|mylona_marsh|elliville|1990 3225 0.0971
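
The IDs above follow the `{sample}|{admin1}|{admin2}|{date}` template recorded in the command under Report metadata. A minimal sketch of how such a template could be filled from one metadata row is shown below. The `build_header` helper and the example row are illustrative assumptions, not Raccoon's code; the row values are taken from the first ID in the table.

```python
def build_header(template: str, row: dict) -> str:
    """Hypothetical helper: fill a header template from one metadata row.

    Extra columns in the row are ignored; a missing column raises KeyError.
    """
    return template.format(**row)


template = "{sample}|{admin1}|{admin2}|{date}"

# Assumed metadata row; column names match the template fields.
row = {
    "sample": "PHL043",
    "admin1": "mylona_marsh",
    "admin2": "duggan_demesne",
    "date": "2026-02-28",
}

header = build_header(template, row)
print(header)
```

Dates are inserted as they appear in the metadata, so year-only values such as 1982 or 2010 in the historical records carry through to the header unchanged.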

7. Metadata issues

No metadata issues detected.

Datafiles

Inputs: cases.fasta, historical_outbreaks.fasta

Metadata: case_metadata.csv, historical_outbreaks_metadata.tsv

Output: cases_background.seq-qc.fasta

Report metadata

Command: raccoon seq-qc --metadata examples/lhfv/input_files/case_metadata.csv examples/lhfv/input_files/historical_outbreaks_metadata.tsv -o examples/lhfv/seq-qc/cases_background.seq-qc.fasta -f examples/lhfv/input_files/cases.fasta examples/lhfv/input_files/historical_outbreaks.fasta --header-fields '{sample}|{admin1}|{admin2}|{date}' --metadata-date-field date --metadata-location-field admin2 --min-length 2000

Generated: 2026-03-16 19:12

Raccoon version: 1.0.2

Python: 3.14.2

Platform: Darwin 24.4.0