biology/vcf-split: Split a multi-sample VCF into single-sample VCFs

Description

Vcf-split splits a multi-sample VCF into single-sample VCFs, writing thousands
of output files simultaneously. Parsing the TOPMed human chromosome 1 BCF
with bcftools takes two days, so extracting its 137,977 samples one at a time,
or with thousands of parallel readers of the same file, is impractical.
Vcf-split avoids this by generating thousands of single-sample outputs in
a single pass through the multi-sample input.
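
The single-pass idea can be illustrated with a minimal Python sketch. This is
not vcf-split's actual implementation (the port is built on biolibc, and its
internals are not described here); the function name split_vcf, the
<sample>.vcf output naming, and the assumption of an uncompressed,
tab-separated text VCF with the nine standard fixed columns are all
hypothetical choices made for illustration.

```python
import os
from contextlib import ExitStack


def split_vcf(in_path, out_dir):
    """Write one single-sample VCF per sample column in a single pass.

    Hypothetical helper for illustration only. Returns the number of
    per-sample files written.
    """
    os.makedirs(out_dir, exist_ok=True)
    with ExitStack() as stack, open(in_path) as vcf:
        meta = []      # '##' meta-information lines, copied to every output
        outputs = []   # one open output file per sample, in column order
        for line in vcf:
            if line.startswith("##"):
                meta.append(line)
                continue
            fields = line.rstrip("\n").split("\t")
            # Columns 1-9 (CHROM..FORMAT) are shared; the rest are per-sample.
            fixed, calls = fields[:9], fields[9:]
            if line.startswith("#CHROM"):
                # Header row: the per-sample columns hold the sample names.
                for sample in calls:
                    path = os.path.join(out_dir, sample + ".vcf")
                    out = stack.enter_context(open(path, "w"))
                    out.writelines(meta)
                    outputs.append(out)
            # Header and data rows alike: fixed columns plus one sample column.
            for out, call in zip(outputs, calls):
                out.write("\t".join(fixed + [call]) + "\n")
        return len(outputs)
```

Each input line is parsed once and its per-sample columns are fanned out to
the already-open output files, so the cost of reading the input is paid a
single time regardless of the sample count. A sketch like this keeps every
output open at once, so the number of samples it can handle in one run is
bounded by the operating system's limit on open file descriptors.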

Details

Provenance
Authored by jwb
Parents
rP568921: biology/biolibc: Low-level high-performance bioinformatics library