Sniffles
GitHub: https://github.com/fritzsedlazeck/Sniffles
Installation
This Python package is installed here with conda, taking advantage of the SVanalyzer module:
module load ceuadmin/SVanalyzer
conda install sniffles
sniffles --version
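Before moving on, it may help to confirm that the tools used in the example are actually on the PATH. The following is a minimal sketch, not part of Sniffles; the function name `require_tools` is hypothetical.

```shell
# Hypothetical helper: report any tools missing from PATH.
# Returns 0 when every tool is found, 1 otherwise.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: require_tools sniffles bcftools && echo "environment looks ready"
```

If a tool is reported missing, loading the corresponding module or activating the conda environment is the likely remedy.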
Example
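For the GIAB PacBio HiFi Ashkenazim Trio data (sample HG002), we download the alignments and the GRCh38 reference, call SVs with Sniffles, and inspect the resulting VCF with bcftools: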
# wget (-c to resume)
wget ftp://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG002_NA24385_son/PacBio_HiFi-Revio_20231031/HG002_PacBio-HiFi-Revio_20231031_48x_GRCh38-GIABv3.bam*
# alternatively, rsync the whole directory (only missing or changed files are transferred)
rsync --partial --progress -av \
rsync://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG002_NA24385_son/PacBio_HiFi-Revio_20231031/* .
wget ftp://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/release/references/GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta.gz*
sniffles \
--input HG002_PacBio-HiFi-Revio_20231031_48x_GRCh38-GIABv3.bam \
--reference GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta.gz \
--vcf HG002_PacBio-HiFi-Revio_20231031_48x_GRCh38-GIABv3.vcf.gz \
--threads 8
module load ceuadmin/bcftools
bcftools query -f "%CHROM\t%POS\t%ID\t%REF\t%ALT\t%QUAL\t%FILTER\n" \
HG002_PacBio-HiFi-Revio_20231031_48x_GRCh38-GIABv3.vcf.gz -H | head -2
We see the end of the Sniffles log, followed by the header and first record reported by bcftools:
Generating index for HG002_PacBio-HiFi-Revio_20231031_48x_GRCh38-GIABv3.vcf.gz...
Indexing VCF output took 0.15s.
Done.
Wrote 28267 called SVs to HG002_PacBio-HiFi-Revio_20231031_48x_GRCh38-GIABv3.vcf.gz (single-sample, sorted, bgzipped, tabix-indexed)
bcftools query -f "%CHROM\t%POS\t%ID\t%REF\t%ALT\t%QUAL\t%FILTER\t\n" HG002_PacBio-HiFi-Revio_20231031_48x_GRCh38-GIABv3.vcf.gz -H | head -2
#[1]CHROM [2]POS [3]ID [4]REF [5]ALT [6]QUAL [7]FILTER
chr1 10863 Sniffles2.INS.4S0 N CAGGCGCAGAGAGGCGCGCCGCGCCGGCGCAGGCGCAGAGAGGCGCGCCGCGCCGGCGCAGGCGCAGAGAGGCGCGCCGCGCCGGCGCAGGCGCAGAGACACATGCTAGCGCGTCCAGGGGAGGAGGCGTGGCA 33 PASS
...
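To get an overview beyond the first record, the calls can be tallied by SV type. The following is a minimal sketch that uses only awk and sort, assuming each record carries an `SVTYPE=` key in its INFO column (column 8) as in standard SV VCFs; the function name `count_svtypes` is hypothetical.

```shell
# Hypothetical helper: tally structural variants by INFO/SVTYPE.
# Reads an uncompressed VCF on stdin and prints "TYPE<TAB>COUNT", sorted by type.
count_svtypes() {
  awk -F'\t' '
    /^#/ { next }                      # skip header lines
    {
      type = "NA"                      # fallback when SVTYPE is absent
      n = split($8, kv, ";")           # INFO is column 8
      for (i = 1; i <= n; i++)
        if (kv[i] ~ /^SVTYPE=/) type = substr(kv[i], 8)
      count[type]++
    }
    END { for (t in count) printf "%s\t%d\n", t, count[t] }
  ' | sort
}

# Usage (assumes the VCF produced above):
# gzip -dc HG002_PacBio-HiFi-Revio_20231031_48x_GRCh38-GIABv3.vcf.gz | count_svtypes
```

The per-type counts should sum to the total number of SVs reported in the Sniffles log.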