Add Description
Arguments
- vcf_file
Path to the VCF file.
- id
ID of the sample to select from the VCF. If NULL, then the first sample will be selected. Default NULL.
- rename
Rename the sample to this value when extracting variants. If NULL, then the sample will be named according to ID.
- sample_field
Some algorithms save the name of the sample in the ##SAMPLE portion of the VCF header (e.g. ##SAMPLE=<ID=TUMOR,SampleName=TCGA-01-0001>). If the ID is specified via the id parameter ("TUMOR" in this example), then sample_field can be used to specify the name of the tag ("SampleName" in this example). Default NULL.
- filename_as_id
If set to TRUE, the file name will be used as the sample name.
- strip_extension
Only used if filename_as_id is set to TRUE. If set to TRUE, the file extension will be stripped from the filename before setting the sample name. If a character vector is given, then all the strings in the vector will be removed from the end of the filename before setting the sample name. Default c(".vcf", ".vcf.gz", ".gz").
- filter
Exclude variants that do not have a PASS in the FILTER column of the VCF. Default TRUE.
- multiallele
Multialleles are when multiple alternative variants are listed in the same row in the VCF. One of "expand" or "exclude". If "expand" is selected, then each alternate allele will be given its own row. If "exclude" is selected, then these rows will be removed. Default "expand".
- extra_fields
Optionally extract additional fields from the INFO section of the VCF. Default NULL.
- fix_vcf_errors
Attempt to automatically fix VCF file formatting errors.
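The difference between the two multiallele options can be illustrated with a small base R sketch. This is not the package's implementation: the toy data frame and its column names (chr, pos, ref, alt) are assumptions made for illustration, and it only mimics the behaviour described for "exclude" and "expand".

```r
# Toy stand-in for a few VCF rows; the column names are illustrative only.
vcf_rows <- data.frame(
  chr = c("chr1", "chr1", "chr2"),
  pos = c(100L, 200L, 300L),
  ref = c("A", "C", "G"),
  alt = c("T", "G,T", "A"),
  stringsAsFactors = FALSE
)

# Rows that list more than one alternate allele, separated by commas.
is_multi <- grepl(",", vcf_rows$alt, fixed = TRUE)

# "exclude": drop multiallelic rows entirely.
excluded <- vcf_rows[!is_multi, ]

# "expand": repeat each row once per alternate allele,
# then assign one allele to each repeated row.
alt_split <- strsplit(vcf_rows$alt, ",", fixed = TRUE)
n_alt <- lengths(alt_split)
expanded <- vcf_rows[rep(seq_len(nrow(vcf_rows)), times = n_alt), ]
expanded$alt <- unlist(alt_split)
rownames(expanded) <- NULL
```

In this sketch the row at position 200 disappears under "exclude" and becomes two rows (one per alternate allele) under "expand".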
Examples
vcf <- system.file("extdata", "public_LUAD_TCGA-97-7938.vcf",
  package = "musicatk"
)
variants <- extract_variants_from_vcf_file(vcf_file = vcf)
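As a further sketch, the same call can be made with some of the non-default arguments documented above. The argument values are illustrative choices rather than recommendations, and the call is guarded so that it only runs when musicatk is installed.

```r
# Only run when musicatk is available; the argument values are illustrative.
has_musicatk <- requireNamespace("musicatk", quietly = TRUE)
if (has_musicatk) {
  vcf <- system.file("extdata", "public_LUAD_TCGA-97-7938.vcf",
    package = "musicatk"
  )
  variants <- musicatk::extract_variants_from_vcf_file(
    vcf_file = vcf,
    filename_as_id = TRUE,                          # name the sample after the file
    strip_extension = c(".vcf", ".vcf.gz", ".gz"),  # suffixes removed from the file name
    multiallele = "exclude"                         # drop rows with several alternate alleles
  )
  head(variants)
}
```

With filename_as_id set to TRUE, the sample name is derived from the file name after the listed extensions are stripped, instead of from the sample ID in the VCF.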