Combines one or more sequence FASTA files with metadata to harmonize sequence headers, and applies sequence QC filters on length and N content.
FASTA files: 2
Metadata files: 2
Total sequences: 53
Filters: Minimum length: 2000
Filtered: 1
Unique locations: 19
Date range: 1982 → 2026-02-28
FASTA files provided: cases.fasta, historical_outbreaks.fasta.
| File | Seqs | Length (min) | Length (max) | Length (mean) | N (min) | N (max) | N (mean) |
|---|---|---|---|---|---|---|---|
| cases.fasta | 43 | 3107 | 3107 | 3107.0 | 0.0151 | 0.0267 | 0.0164 |

| ID | Length | N content | Status |
|---|---|---|---|
| PHL043 | 3107 | 0.0151 | kept |
| PHL036 | 3107 | 0.0151 | kept |
| PHL029 | 3107 | 0.0151 | kept |
| PHL042 | 3107 | 0.0151 | kept |
| PHL039 | 3107 | 0.0151 | kept |
| PHL009 | 3107 | 0.0151 | kept |
| PHL034 | 3107 | 0.0151 | kept |
| PHL037 | 3107 | 0.0183 | kept |
| PHL027 | 3107 | 0.0183 | kept |
| PHL024 | 3107 | 0.0151 | kept |
| PHL025 | 3107 | 0.0193 | kept |
| PHL031 | 3107 | 0.0151 | kept |
| PHL038 | 3107 | 0.0151 | kept |
| PHL035 | 3107 | 0.0151 | kept |
| PHL030 | 3107 | 0.0151 | kept |
| PHL033 | 3107 | 0.0151 | kept |
| PHL041 | 3107 | 0.0151 | kept |
| PHL001 | 3107 | 0.0267 | kept |
| PHL032 | 3107 | 0.0151 | kept |
| PHL002 | 3107 | 0.0203 | kept |
| PHL003 | 3107 | 0.0203 | kept |
| PHL018 | 3107 | 0.0151 | kept |
| PHL040 | 3107 | 0.0151 | kept |
| PHL020 | 3107 | 0.0151 | kept |
| PHL019 | 3107 | 0.0151 | kept |
| PHL014 | 3107 | 0.0151 | kept |
| PHL028 | 3107 | 0.0151 | kept |
| PHL021 | 3107 | 0.0196 | kept |
| PHL016 | 3107 | 0.0151 | kept |
| PHL013 | 3107 | 0.02 | kept |
| PHL026 | 3107 | 0.0151 | kept |
| PHL010 | 3107 | 0.0151 | kept |
| PHL007 | 3107 | 0.0151 | kept |
| PHL023 | 3107 | 0.0151 | kept |
| PHL022 | 3107 | 0.0151 | kept |
| PHL008 | 3107 | 0.0151 | kept |
| PHL017 | 3107 | 0.0151 | kept |
| PHL004 | 3107 | 0.0206 | kept |
| PHL015 | 3107 | 0.0151 | kept |
| PHL005 | 3107 | 0.0151 | kept |
| PHL012 | 3107 | 0.0151 | kept |
| PHL011 | 3107 | 0.0212 | kept |
| PHL006 | 3107 | 0.0151 | kept |

| File | Seqs | Length (min) | Length (max) | Length (mean) | N (min) | N (max) | N (mean) |
|---|---|---|---|---|---|---|---|
| historical_outbreaks.fasta | 10 | 132 | 3244 | 2908.9 | 0.0 | 0.8237 | 0.0922 |

| ID | Length | N content | Status |
|---|---|---|---|
| LHFV001 | 3244 | 0.0003 | kept |
| LHFV002 | 3210 | 0.0 | kept |
| LHFV003 | 3244 | 0.0 | kept |
| LHFV004 | 3196 | 0.0 | kept |
| LHFV005 | 3225 | 0.0 | kept |
| LHFV006 | 3213 | 0.0 | kept |
| LHFV007 | 3200 | 0.0006 | kept |
| LHFV008 | 132 | 0.0 | filtered |
| LHFV009 | 3200 | 0.8237 | kept |
| LHFV010 | 3225 | 0.0971 | kept |
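The per-file summary row for historical_outbreaks.fasta can be reproduced from the per-sequence rows above. The following is a minimal sketch, not raccoon's own code: the `summarize` helper is hypothetical, and rounding to one decimal place (length) and four decimal places (N content) is an assumption based on the values displayed in the report.

```python
from statistics import mean

# (id, length, n_content) rows copied from the historical_outbreaks.fasta table
rows = [
    ("LHFV001", 3244, 0.0003),
    ("LHFV002", 3210, 0.0),
    ("LHFV003", 3244, 0.0),
    ("LHFV004", 3196, 0.0),
    ("LHFV005", 3225, 0.0),
    ("LHFV006", 3213, 0.0),
    ("LHFV007", 3200, 0.0006),
    ("LHFV008", 132, 0.0),
    ("LHFV009", 3200, 0.8237),
    ("LHFV010", 3225, 0.0971),
]


def summarize(rows):
    """Hypothetical helper: per-file summary over (id, length, n_content) rows."""
    lengths = [length for _, length, _ in rows]
    n_values = [n for _, _, n in rows]
    return {
        "seqs": len(rows),
        "length_min": min(lengths),
        "length_max": max(lengths),
        "length_mean": round(mean(lengths), 1),  # rounding is an assumption
        "n_min": min(n_values),
        "n_max": max(n_values),
        "n_mean": round(mean(n_values), 4),  # rounding is an assumption
    }


summary = summarize(rows)
print(summary)
```

The summary statistics are computed over all sequences in the file, including LHFV008, which is why the minimum length is 132 even though that sequence is filtered from the output.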
Metadata files provided: case_metadata.csv, historical_outbreaks_metadata.tsv.
Distribution of sequence lengths across all input FASTA files. If a minimum length threshold is set, it is marked on the plot.
Date distribution by location.
The following sequences failed the length or N-content filters and will not be present in the output FASTA file:
| file | id | parsed_id | length | n_content | reason |
|---|---|---|---|---|---|
| historical_outbreaks.fasta | LHFV008 | LHFV008 | 132 | 0.0 | length < 2000 |
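The filtering decision recorded above can be sketched as follows. This is an illustration under stated assumptions, not raccoon's implementation: the function names `n_content` and `qc_status` are hypothetical, N content is assumed to be the fraction of `N` characters in the sequence, and the `max_n` parameter and its reason string are invented here to show how an N-content threshold might be applied (this run set only `--min-length 2000`).

```python
def n_content(seq: str) -> float:
    """Fraction of characters that are N (case-insensitive); 0.0 if empty.

    Assumption: N content is count('N') / length.
    """
    if not seq:
        return 0.0
    return seq.upper().count("N") / len(seq)


def qc_status(seq: str, min_length=None, max_n=None):
    """Hypothetical helper returning (status, reason) for one sequence.

    A threshold left as None is not applied.
    """
    if min_length is not None and len(seq) < min_length:
        return "filtered", f"length < {min_length}"
    if max_n is not None and n_content(seq) > max_n:
        # Reason wording is an assumption, modelled on the length reason.
        return "filtered", f"n_content > {max_n}"
    return "kept", ""


short_seq = "ACGT" * 33                   # 132 bases, like LHFV008
masked_seq = "N" * 2636 + "A" * 564       # 3200 bases, mostly N (toy data)

print(qc_status(short_seq, min_length=2000))
print(qc_status(masked_seq, min_length=2000))
print(qc_status(masked_seq, min_length=2000, max_n=0.5))
```

With only a minimum length set, a long sequence with high N content passes, which matches LHFV009 being kept in this run despite an N content of 0.8237.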
| ID | Length | N content |
|---|---|---|
| PHL043\|mylona_marsh\|duggan_demesne\|2026-02-28 | 3107 | 0.0151 |
| PHL036\|mylona_marsh\|elliville\|2026-02-19 | 3107 | 0.0151 |
| PHL029\|mylona_marsh\|myra_myre\|2026-02-14 | 3107 | 0.0151 |
| PHL042\|mylona_marsh\|myra_myre\|2026-02-27 | 3107 | 0.0151 |
| PHL039\|mylona_marsh\|myra_myre\|2026-02-24 | 3107 | 0.0151 |
| PHL009\|mylona_marsh\|morang_a_moor\|2026-01-20 | 3107 | 0.0151 |
| PHL034\|mylona_marsh\|joyce_jungle\|2026-02-18 | 3107 | 0.0151 |
| PHL037\|mylona_marsh\|joyce_jungle\|2026-02-20 | 3107 | 0.0183 |
| PHL027\|mylona_marsh\|myra_myre\|2026-02-11 | 3107 | 0.0183 |
| PHL024\|mylona_marsh\|lusamaki_lake\|2026-02-10 | 3107 | 0.0151 |
| PHL025\|mylona_marsh\|elliville\|2026-02-10 | 3107 | 0.0193 |
| PHL031\|mylona_marsh\|elliville\|2026-02-16 | 3107 | 0.0151 |
| PHL038\|mylona_marsh\|elliville\|2026-02-20 | 3107 | 0.0151 |
| PHL035\|mylona_marsh\|elliville\|2026-02-18 | 3107 | 0.0151 |
| PHL030\|mylona_marsh\|elliville\|2026-02-15 | 3107 | 0.0151 |
| PHL033\|mylona_marsh\|elliville\|2026-02-17 | 3107 | 0.0151 |
| PHL041\|mylona_marsh\|lililand\|2026-02-26 | 3107 | 0.0151 |
| PHL001\|mylona_marsh\|morang_a_moor\|2025-12-03 | 3107 | 0.0267 |
| PHL032\|lomanland\|nickopolis\|2026-02-16 | 3107 | 0.0151 |
| PHL002\|mylona_marsh\|morang_a_moor\|2025-12-08 | 3107 | 0.0203 |
| PHL003\|mylona_marsh\|morang_a_moor\|2025-12-28 | 3107 | 0.0203 |
| PHL018\|mylona_marsh\|elliville\|2026-02-07 | 3107 | 0.0151 |
| PHL040\|mylona_marsh\|elliville\|2026-02-24 | 3107 | 0.0151 |
| PHL020\|mylona_marsh\|elliville\|2026-02-08 | 3107 | 0.0151 |
| PHL019\|mylona_marsh\|lililand\|2026-02-07 | 3107 | 0.0151 |
| PHL014\|mylona_marsh\|morang_a_moor\|2026-02-01 | 3107 | 0.0151 |
| PHL028\|mylona_marsh\|morang_a_moor\|2026-02-11 | 3107 | 0.0151 |
| PHL021\|mylona_marsh\|lililand\|2026-02-08 | 3107 | 0.0196 |
| PHL016\|mylona_marsh\|lililand\|2026-02-05 | 3107 | 0.0151 |
| PHL013\|mylona_marsh\|lililand\|2026-01-30 | 3107 | 0.02 |
| PHL026\|mylona_marsh\|elliville\|2026-02-10 | 3107 | 0.0151 |
| PHL010\|mylona_marsh\|bedeburgh\|2026-01-20 | 3107 | 0.0151 |
| PHL007\|mylona_marsh\|bedeburgh\|2026-01-10 | 3107 | 0.0151 |
| PHL023\|lomanland\|nickopolis\|2026-02-09 | 3107 | 0.0151 |
| PHL022\|lomanland\|nickopolis\|2026-02-08 | 3107 | 0.0151 |
| PHL008\|mylona_marsh\|faux_kent\|2026-01-17 | 3107 | 0.0151 |
| PHL017\|mylona_marsh\|donkor_dale\|2026-02-06 | 3107 | 0.0151 |
| PHL004\|mylona_marsh\|quaye_quay\|2026-01-02 | 3107 | 0.0206 |
| PHL015\|mylona_marsh\|maloney_mere\|2026-02-01 | 3107 | 0.0151 |
| PHL005\|mylona_marsh\|donkor_dale\|2026-01-03 | 3107 | 0.0151 |
| PHL012\|mylona_marsh\|lusamaki_lake\|2026-01-26 | 3107 | 0.0151 |
| PHL011\|mylona_marsh\|elliville\|2026-01-23 | 3107 | 0.0212 |
| PHL006\|mylona_marsh\|elliville\|2026-01-05 | 3107 | 0.0151 |
| LHFV001\|mylona_marsh\|elliville\|2010 | 3244 | 0.0003 |
| LHFV002\|lomanland\|radsborough\|2006 | 3210 | 0.0 |
| LHFV003\|goodfellow_forest\|glenian\|2007 | 3244 | 0.0 |
| LHFV004\|wilkins_sound\|samford\|1982 | 3196 | 0.0 |
| LHFV005\|goodfellow_forest\|glenian\|1990 | 3225 | 0.0 |
| LHFV006\|willmott_woods\|hannahland\|2019 | 3213 | 0.0 |
| LHFV007\|outer_otooles\|inis_aine\|2005 | 3200 | 0.0006 |
| LHFV009\|lomanland\|centre\|2005 | 3200 | 0.8237 |
| LHFV010\|mylona_marsh\|elliville\|1990 | 3225 | 0.0971 |
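The harmonized IDs above follow the `{sample}|{admin1}|{admin2}|{date}` template passed to `--header-fields` in this run. A minimal sketch of that substitution is shown below. It is not raccoon's implementation: the `harmonize_header` helper is hypothetical, and the metadata column names `sample` and `admin1` are inferred from the template rather than confirmed from the metadata files.

```python
# Template copied from the --header-fields option used in this run.
HEADER_TEMPLATE = "{sample}|{admin1}|{admin2}|{date}"


def harmonize_header(record_id: str, metadata: dict, template: str = HEADER_TEMPLATE) -> str:
    """Hypothetical helper: fill the header template from one metadata row.

    `metadata` maps a sequence ID to a dict of column name -> value.
    A missing ID or missing column raises KeyError.
    """
    row = metadata[record_id]
    return template.format(**row)


# Two rows reconstructed from the output table; column names are assumed.
metadata = {
    "PHL043": {
        "sample": "PHL043",
        "admin1": "mylona_marsh",
        "admin2": "duggan_demesne",
        "date": "2026-02-28",
    },
    "LHFV004": {
        "sample": "LHFV004",
        "admin1": "wilkins_sound",
        "admin2": "samford",
        "date": "1982",
    },
}

for record_id in metadata:
    print(harmonize_header(record_id, metadata))
```

Dates are passed through as strings, so year-only values such as `1982` and full dates such as `2026-02-28` can share one template.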
No metadata issues detected.
Inputs: cases.fasta, historical_outbreaks.fasta
Metadata: case_metadata.csv, historical_outbreaks_metadata.tsv
Output: cases_background.seq-qc.fasta
Command: raccoon seq-qc --metadata examples/lhfv/input_files/case_metadata.csv examples/lhfv/input_files/historical_outbreaks_metadata.tsv -o examples/lhfv/seq-qc/cases_background.seq-qc.fasta -f examples/lhfv/input_files/cases.fasta examples/lhfv/input_files/historical_outbreaks.fasta --header-fields '{sample}|{admin1}|{admin2}|{date}' --metadata-date-field date --metadata-location-field admin2 --min-length 2000
Generated: 2026-03-16 19:12
Raccoon version: 1.0.2
Python: 3.14.2
Platform: Darwin 24.4.0