[BioC] readVcf bgzip error
Davis, Brian
Brian.Davis at uth.tmc.edu
Wed Oct 10 15:33:01 CEST 2012
Valerie,
This is human-subject data, so I'll have to work on getting permission to share it on my end. In the meantime I'll try to reproduce the error with 1000 Genomes data.
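One thing that can be checked locally, without sharing the file, is whether every record has the same number of tab-separated fields as the #CHROM header line. The error quoted below reports a genotype-looking string ('0/0:.:130:131:.:.') where INFO was expected, and the colnames output ends in an empty sample name (''), which might point to shifted or extra columns. That is only a guess. Below is a minimal base-R sketch of the check; `check_vcf_fields` is a hypothetical helper written for this message, not a function from VariantAnnotation or Rsamtools, and the toy file is made up for illustration.

```r
# Hypothetical helper: list data lines whose tab-separated field count
# differs from that of the #CHROM header line.
check_vcf_fields <- function(path) {
  lines <- readLines(path)

  header_idx <- which(startsWith(lines, "#CHROM"))[1]
  if (is.na(header_idx)) stop("no #CHROM header line found in ", path)

  # Count tabs rather than using strsplit(), so that a trailing tab
  # (an empty last field) is counted as a field too.
  n_fields <- lengths(regmatches(lines, gregexpr("\t", lines, fixed = TRUE))) + 1L

  # Data records are the non-empty lines that do not start with '#'.
  data_idx <- which(!startsWith(lines, "#") & nzchar(lines))
  bad <- data_idx[n_fields[data_idx] != n_fields[header_idx]]

  data.frame(
    line     = bad,                    # line number in the file
    record   = match(bad, data_idx),   # 1-based record number
    n_fields = n_fields[bad],
    expected = rep(n_fields[header_idx], length(bad))
  )
}

# Toy input (assumption, for illustration only): the second record
# is missing one sample column.
toy <- tempfile(fileext = ".vcf")
writeLines(c(
  "##fileformat=VCFv4.1",
  "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
  "1\t100\t.\tA\tG\t50\tPASS\tNS=2\tGT\t0/0\t0/1",
  "1\t200\t.\tC\tT\t50\tPASS\tNS=2\tGT\t0/0"
), toy)

res <- check_vcf_fields(toy)
print(res)
```

Running the same function on "first10K.vcf" and looking at the rows near the record named in the error would show whether the uncompressed file itself is irregular there. An empty result would suggest the problem lies elsewhere, for example in the compression or indexing step.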
Brian
-----Original Message-----
From: Valerie Obenchain [mailto:vobencha at fhcrc.org]
Sent: Tuesday, October 09, 2012 6:19 PM
To: Davis, Brian
Cc: bioconductor at r-project.org
Subject: Re: [BioC] readVcf bgzip error
Hi Brian,
I'm not sure what's going on here. Can you point me to where you got this file or is it small enough to send?
Valerie
On 10/09/2012 12:06 PM, Davis, Brian wrote:
> I'm seeing an error when I read in a compressed VCF, but not when I read in the uncompressed VCF. Can anyone point me in the right direction to figure out what I'm doing wrong? I've tried this on three different VCFs and get the same error each time (a different record fails in each).
>
>> # read in a complete file
>> fl<- "first10K.vcf"
>> vcf<- readVcf(fl, "hg19")
>> vcf
> class: VCF
> dim: 9934 998
> genome: hg19
> exptData(1): header
> fixed(4): REF ALT QUAL FILTER
> info(39): NS DP ... HD HP
> geno(6): GT VR ... GQ FT
> rownames(9934): 1:69270 1:69360 ... 1:19597392 1:19597396
> rowData values names(1): paramRangeID
> colnames(998): A00003 A00057 ... A16457 ''
> colData names(1): Samples
>> # now try again but compress it first
>> fl<- "first10K.vcf"
>> compressVcf<- bgzip(fl, tempfile())
>> idx<- indexTabix(compressVcf, "vcf")
>> tab<- TabixFile(compressVcf, idx)
>> vcf<- readVcf(tab, "hg19")
> Error: scanVcf: record 4370 INFO '0/0:.:130:131:.:.' not found
> path:
> C:\Users\bdavis2\AppData\Local\Temp\RtmpwrXcST\file1dc84cff4177
>> sessionInfo()
> R version 2.15.1 (2012-06-22)
> Platform: x86_64-pc-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] VariantAnnotation_1.2.10 Rsamtools_1.8.6 Biostrings_2.24.1 GenomicRanges_1.8.13
> [5] IRanges_1.14.4 BiocGenerics_0.2.0
>
> loaded via a namespace (and not attached):
> [1] AnnotationDbi_1.18.1 Biobase_2.16.0 biomaRt_2.12.0 bitops_1.0-4.1
> [5] BSgenome_1.24.0 DBI_0.2-5 GenomicFeatures_1.8.3 grid_2.15.1
> [9] lattice_0.20-10 Matrix_1.0-9 RCurl_1.95-1.1 RSQLite_0.11.2
> [13] rtracklayer_1.16.3 snpStats_1.6.0 splines_2.15.1 stats4_2.15.1
> [17] survival_2.36-14 tools_2.15.1 XML_3.95-0.1 zlibbioc_1.2.0
>
> Brian