Split individual sample VCFs from an aggregate VCF.
Given an aggregate VCF file containing multiple samples, split it into isolated VCFs, one per sample. Alternatively, specify only the sample(s) to extract if isolated VCFs are not needed for every sample.
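To make the operation concrete, here is a minimal pure-Python sketch of what splitting an aggregate VCF into per-sample VCFs means. It is an illustration only, not this function's implementation: the helper name `split_vcf_text` is hypothetical, it works on in-memory text rather than URIs, and it keeps every record for each sample without recomputing any INFO fields.

```python
from typing import Dict, List, Optional, Sequence

FIXED_COLUMNS = 9  # CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT


def split_vcf_text(
    vcf_text: str, samples: Optional[Sequence[str]] = None
) -> Dict[str, str]:
    """Return {sample_name: single-sample VCF text} (hypothetical helper)."""
    meta: List[str] = []
    header: Optional[List[str]] = None
    records: List[List[str]] = []
    for line in vcf_text.splitlines():
        if line.startswith("##"):
            meta.append(line)  # meta-information lines are copied as-is
        elif line.startswith("#CHROM"):
            header = line.split("\t")
        elif line:
            records.append(line.split("\t"))

    if header is None or len(header) <= FIXED_COLUMNS:
        raise ValueError("no sample columns found in the VCF header")

    all_samples = header[FIXED_COLUMNS:]
    # samples=None mirrors the documented default: isolate every sample.
    wanted = list(samples) if samples is not None else all_samples
    missing = [name for name in wanted if name not in all_samples]
    if missing:
        raise KeyError(f"samples not present in VCF: {missing}")

    isolated: Dict[str, str] = {}
    for name in wanted:
        col = header.index(name, FIXED_COLUMNS)
        lines = meta + ["\t".join(header[:FIXED_COLUMNS] + [name])]
        # Keep the fixed columns plus this sample's genotype column only.
        lines += ["\t".join(rec[:FIXED_COLUMNS] + [rec[col]]) for rec in records]
        isolated[name] = "\n".join(lines) + "\n"
    return isolated
```

Passing `samples=["S2"]` to this sketch would return only the VCF for `S2`, which corresponds to the optional sample selection described above.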
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| vcf_uri | str | Aggregate VCF URI. | required |
| output_uri | str | Output URI to which the isolated VCFs are written. | required |
| namespace | str | TileDB Cloud namespace in which to run the task graph. | required |
| acn | str | Friendly name of the access credential used to authenticate storage I/O. | required |
| resources | Mapping[str, str] | Resources allocated to the splitting UDF (start with the default). | {'cpu': '2', 'memory': '30Gi'} |
| compute | bool | Whether to execute the DAG. | True |
| verbose | bool | Logging verbosity. | False |
| samples | Optional[Sequence[str]] | Sample names within vcf_uri to isolate, for cases where isolating all samples (the default) is not desired. | None |
| retry_count | int | Number of node retries. | 1 |
| max_workers | int | Maximum number of workers to run simultaneously. | 100 |
| config | Optional[Mapping[str, int]] | TileDB configuration parameters used to configure the virtual filesystem handler. | |
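
As a rough sketch of how the parameters above fit together, the following helper assembles a keyword-argument dictionary using the documented names and defaults. The helper `build_split_kwargs`, the example URIs, the namespace, and the credential name are all hypothetical; this section does not name the function being documented, so the final call is left as a comment.

```python
from typing import Any, Dict, Mapping, Optional, Sequence

# Default taken from the parameter table above.
DEFAULT_RESOURCES: Dict[str, str] = {"cpu": "2", "memory": "30Gi"}


def build_split_kwargs(
    vcf_uri: str,
    output_uri: str,
    namespace: str,
    acn: str,
    *,
    resources: Optional[Mapping[str, str]] = None,
    compute: bool = True,
    verbose: bool = False,
    samples: Optional[Sequence[str]] = None,
    retry_count: int = 1,
    max_workers: int = 100,
    config: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Collect and sanity-check arguments (hypothetical helper)."""
    required = {
        "vcf_uri": vcf_uri,
        "output_uri": output_uri,
        "namespace": namespace,
        "acn": acn,
    }
    for name, value in required.items():
        if not value:
            raise ValueError(f"{name} is required")
    if samples is not None and len(samples) == 0:
        raise ValueError("samples must be None (all samples) or non-empty")
    return {
        **required,
        "resources": dict(resources) if resources is not None else dict(DEFAULT_RESOURCES),
        "compute": compute,
        "verbose": verbose,
        "samples": list(samples) if samples is not None else None,
        "retry_count": retry_count,
        "max_workers": max_workers,
        "config": dict(config) if config is not None else None,
    }


kwargs = build_split_kwargs(
    "s3://example-bucket/cohort.vcf.gz",  # hypothetical aggregate VCF
    "s3://example-bucket/isolated/",      # hypothetical output location
    "example-namespace",                  # hypothetical namespace
    "example-credential",                 # hypothetical credential name
    samples=["HG00096", "HG00097"],       # isolate two samples only
)
# The splitting function would then be called with these arguments,
# e.g. some_split_function(**kwargs); its name is not given in this section.
```

Leaving `samples` unset keeps the documented default of isolating every sample, and starting from the default `resources` before scaling up follows the guidance in the table.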