[Bioc-sig-seq] Problem loading MAQ map file in ShortRead
Martin Morgan
mtmorgan at fhcrc.org
Thu Apr 30 20:47:46 CEST 2009
Hi Gordon --
Gordon Robertson wrote:
> Thanks, Martin,
>
> Given the folder contents, the original pattern would have matched both
> ‘map’ and ‘mapstats’:
> {xhost06}/archive/solexa1_4/analysis/HS1035/30WLMAAXX_5/maq> ll
> ...
> -rw-rw-r-- 1 rvarhol slx_service 508774 Apr 22 15:52 30WLMAAXX_5.map
> -rw-rw-r-- 1 rvarhol slx_service 2 Apr 22 15:52 30WLMAAXX_5.mapstats
> ...
>
> The more specific pattern that you suggested worked:
>> readAligned("/archive/solexa1_4/analysis/HS1035/30WLMAAXX_5/maq/",
> pattern="^30WLMAAXX_5.map$", type="MAQMap")
> class: AlignedRead
> length: 9057150 reads; width: 50 cycles
> chromosome: 1 1 ... Y Y
> position: 2 9 ... 57442746 57443286
> strand: - - ... - -
> alignQuality: IntegerQuality
> alignData varLabels: nMismatchBestHit mismatchQuality nExactMatch24
> nOneMismatch24
> ...
>
> Note though that when I specified the incorrect type=MAQMapview, pattern
> ambiguity seemed not to be an issue. I’d expect to see the ambiguity
> error in both MAQMap and MAQMapview cases.
Yes, and this is also the reason for the poor error message -- MAQMap
differs from most input functions in that it really expects just a
single file for input; MAQMapview would have happily read in (to a
single AlignedRead object) all files matching the pattern.
I'll try to improve the error message (in the development branch).
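As a rough illustration of the pattern ambiguity, here is a minimal sketch
in base R only. It recreates the two file names from the listing above as
empty placeholder files in a scratch directory and compares the loose and
the anchored pattern. The directory name 'maq_demo' and the helper
single_match() are made up for this example and are not part of ShortRead;
the sketch also escapes the '.' so that it matches a literal dot, which is
slightly stricter than the pattern suggested earlier in the thread.

```r
# Recreate the two file names from the listing in a scratch directory.
# The files are empty placeholders; only their names matter here.
dir_path <- file.path(tempdir(), "maq_demo")   # hypothetical location
dir.create(dir_path, showWarnings = FALSE)
file.create(file.path(dir_path,
                      c("30WLMAAXX_5.map", "30WLMAAXX_5.mapstats")))

# 'pattern' is a regular expression, not a file name: unanchored, it
# matches anywhere in the name, so '.mapstats' is picked up as well.
loose  <- list.files(dir_path, pattern = "30WLMAAXX_5.map")
strict <- list.files(dir_path, pattern = "^30WLMAAXX_5\\.map$")

print(loose)    # both files
print(strict)   # only the .map file

# Hypothetical helper (not a ShortRead function): stop early, with a
# readable message, unless the pattern selects exactly one file.
single_match <- function(dirPath, pattern) {
    hits <- list.files(dirPath, pattern = pattern, full.names = TRUE)
    if (length(hits) != 1L)
        stop("pattern '", pattern, "' matched ", length(hits),
             " files; expected exactly 1")
    hits
}
```

Running list.files() with the intended pattern before calling readAligned()
with type="MAQMap" is a cheap way to confirm that only one file is selected.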
Martin
> G
>
> On 4/30/09 11:16 AM, "Martin Morgan" <mtmorgan at fhcrc.org> wrote:
>
> Hi Gordon --
>
> Gordon Robertson wrote:
> > I'm starting to learn to use ShortRead, so apologize if I've missed
> > something simple. I've read the relevant sections of the three PDF
> docs that
> > are available from the ShortRead web page.
> >
> > I'm using R-2.9.0, compiled from source on 64-bit Red Hat
> Enterprise Linux,
> > and installed ShortRead within the past two weeks.
> >
> > Briefly, I'm able to load an Illumina Export file, but not the
> corresponding
> > Maq file. I don't understand the MAQMap error message, but I don't
> get such
> > an error when I change the dirPath, pattern and type and rerun to
> load the
> > SolexaExport file, which is in the folder above the 'map' file.
> And I seem
> > to be able to start to load the 'map' file using an incorrect
> 'MAQMapview'
> > type.
> >
> > 1. Attempt to read a MAQ 0.7.1 'map' file
> > > readAligned("/archive/solexa1_4/analysis/HS1035/30WLMAAXX_5/maq/",
> > pattern="30WLMAAXX_5.map", type="MAQMap")
> > Error: UserArgumentMismatch
> > 'dirPath', 'pattern' must be 'character(1)'
>
> I think the problem is that your 'pattern' isn't specific enough, e.g.,
>
> list.files("/archive/solexa1_4/analysis/HS1035/30WLMAAXX_5/maq/",
> pattern="30WLMAAXX_5.map")
>
> matches more than one file; often "^30WLMAAXX_5.map$" is more
> appropriate.
>
> Martin
>
> >
> > 2. Try the Illumina 'export' file
> > > readAligned("/archive/solexa1_4/analysis/HS1035/30WLMAAXX_5/",
> > pattern="30WLMAAXX_5_export.txt", type="SolexaExport")
> > class: AlignedRead
> > length: 15256798 reads; width: 76 cycles
> > chromosome: QC QC ... QC QC
> > position: NA NA ... NA NA
> > strand: NA NA ... NA NA
> > alignQuality: NumericQuality
> > alignData varLabels: run lane ... y filtering
> >
> > 3. As a test, I can apparently start to load the 'map' file using
> > type=MAQMapview.
> > > readAligned('/archive/solexa1_4/analysis/HS1035/30WLMAAXX_5/maq/',
> > pattern='30WLMAAXX_5.map', type='MAQMapview')
> > Error: Input/Output
> > 'readAligned' failed to parse files
> > dirPath: '/archive/solexa1_4/analysis/HS1035/30WLMAAXX_5/maq/'
> > pattern: '30WLMAAXX_5.map'
> > type: 'MAQMapview'
> > error: scan() expected 'an integer', got
> >
> 'vv####################]}################?##Mq###7,##^|D#####X~!D######'
> >
> > --
> > Could you help me understand why I'm unable to readAligned the
> 'map' file?
> > Again, I apologize if I've missed something simple. Thanks for
> your help.
> >
> > G
>
>
> --
> Martin Morgan
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
>
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793
>
>
>
> --
> Gordon Robertson
> Gene Regulation Informatics
> Canada's Michael Smith Genome Sciences Centre
> Vancouver BC Canada
> www.bcgsc.ca
>
--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793