[R] formatting a list
Charles C. Berry
cberry at tajo.ucsd.edu
Fri Oct 26 20:24:00 CEST 2007
On Fri, 26 Oct 2007, Tomas Vaisar wrote:
> Hi Chuck,
>
> I finally got to install v 2.6.0 and tried your initial suggestions - with
> the new version the
>
> dat <- as.data.frame( matrix( scan('tmp.txt'), nr=19) )
>
> did not make the list in the desired format, however the other two worked.
Tomas,
I am glad to hear that those were successful.
I believe that
dat <- as.data.frame( <etc> )
did indeed create a list in the 'desired format'. This use of
'as.data.frame' is a standard trick for turning a matrix into a list
whose components are the columns of the matrix (which in the above case
are the rows of your data file).
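As a minimal sketch of that trick, here is a toy version with a made-up 3-row file standing in for the 7000-row one. The temporary file and its values are assumptions for illustration only, not the real data:

```r
# Toy stand-in for the data file: 3 rows of 19 tab-separated values.
tmp <- tempfile(fileext = ".txt")
for (i in 1:3) {
  cat(seq(from = i, by = 1, length.out = 19), "\n",
      sep = "\t", file = tmp, append = TRUE)
}

# scan() reads the file row by row into one long vector; matrix() fills
# column-wise, so with nrow = 19 each COLUMN of the matrix holds one ROW
# of the file.
m <- matrix(scan(tmp, quiet = TRUE), nrow = 19)

# as.data.frame() turns each matrix column into one list component.
dat <- as.data.frame(m)

length(dat)   # one component per row of the file
dat[[2]]      # the second row of the file, as a plain numeric vector

unlink(tmp)   # remove the temporary file
```

With the real file, length(dat) should be 7000 and each dat[[i]] a vector of 19 values.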
But I suspect that you printed it (or several elements like 'dat[1:3]' )
out and were fooled by what you saw.
This would happen because in this case class(dat) =='data.frame'.
data.frames are lists - try
is.list(dat)
There is a print method for data.frame, so the appearance of
print( dat[ 1:3 ] )
and
print( unclass( dat[ 1:3 ] ) )
on your screen is rather different.
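A short sketch of the difference, using a small made-up data frame (19 rows, 3 columns) rather than the real one. The as.list() step at the end is an extra suggestion, in case a plain list with no data.frame class is wanted:

```r
# Toy data frame with the same layout as above: each column is one "row" of data.
dat <- as.data.frame(matrix(1:57, nrow = 19))

class(dat)     # "data.frame"
is.list(dat)   # TRUE: a data frame is a list of equal-length columns

# With the class attribute in place, print() dispatches to print.data.frame
# and shows a table.
print(dat[1:3])

# unclass() drops the class attribute, so the default list printing is used
# and the three vectors are shown one after another.
print(unclass(dat[1:3]))

# as.list() returns the same components as a plain list.
plain <- as.list(dat)
class(plain)   # "list"
```

The contents are the same in both printouts; only the print method differs.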
Chuck
>
> Thanks a lot again.
>
> Tomas
>
> Charles C. Berry wrote:
>>
>> Tomas,
>>
>> Are you using R-2.6.0 ??
>>
>> Each method works for me, producing a list of 7000 vectors.
>>
>> The file I used to test this is created by:
>>
>> for (i in 1:7000) cat( seq(from=i,by=1,length=19),"\n",
>> sep='\t',file="tmp.tab",append=TRUE)
>>
>> The first line is:
>>
>> > scan("tmp.tab",nlines=1)
>> Read 19 items
>> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
>> >
>>
>> The last line is
>>
>> > scan("tmp.tab",skip=6999,nlines=1)
>> Read 19 items
>> [1] 7000 7001 7002 7003 7004 7005 7006 7007 7008 7009 7010 7011 7012 7013
>> 7014 7015 7016 7017 7018
>> >
>>
>> and each method recapitulates this:
>>
>> > dat[[1]]
>> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
>> > dat[[7000]]
>> [1] 7000 7001 7002 7003 7004 7005 7006 7007 7008 7009 7010 7011 7012 7013
>> 7014 7015 7016 7017 7018
>> >
>>
>> The second method threw lots of warnings because open connections must be
>> closed. Those could be eliminated by explicitly opening and closing the
>> connection. Before using the second method closeAllConnections() was
>> sometimes needed, but the error it reported differs from the one you
>> mention.
>>
>> I am using
>>
>> > version
>> _
>> platform i386-pc-mingw32
>> arch i386
>> os mingw32
>> system i386, mingw32
>> status
>> major 2
>> minor 6.0
>> year 2007
>> month 10
>> day 03
>> svn rev 43063
>> language R
>> version.string R version 2.6.0 (2007-10-03)
>> >
>>
>>
>> Chuck
>>
>> On Mon, 22 Oct 2007, Tomas Vaisar wrote:
>>
>> > Hi Chuck,
>> >
>> > thanks for your responses. I did not ignore your suggestions - I did
>> > try them and they did not produce what I need.
>> >
>> > The first one produced table with the same format as a read.table would
>> > generate, not a list of lists.
>> > Second one gave me an error after returning Read 19 items multiple times
>> > : Error in textConnection(x) : all connections are in use
>> > The last one gave me similar error on the first step - Error in
>> > file(con, "r") : all connections are in use
>> >
>> > However, your last suggestion to make list of lists seems that it
>> > works. I will have to test more.
>> >
>> > Cheers,
>> >
>> > Tomas
>> >
>> > Charles C. Berry wrote:
>> > >
>> > > Tomas,
>> > >
>> > > Three different ways to create a list of 7000 vectors from a file of
>> > > 7000 rows and 19 columns are given here:
>> > >
>> > > http://article.gmane.org/gmane.comp.lang.r.general/97032
>> > >
>> > > which I think is what you are asking for.
>> > >
>> > > If you truly need a list of 7000 lists each of length 1 containing a
>> > > vector of length 19, then do this:
>> > >
>> > > list.of.lists.of.one.vector.each <- lapply( list.of.vectors, list )
>> > >
>> > >
>> > > BTW, as this thread appears in
>> > >
>> > > http://news.gmane.org/gmane.comp.lang.r.general
>> > >
>> > > the above article was the first reply to your original query. I am
>> > > puzzled as to why you did not simply implement one of the three
>> > > methods shown there.
>> > >
>> > > Chuck
>> > >
>> > > On Mon, 22 Oct 2007, Tomas Vaisar wrote:
>> > >
>> > > > Hi Jim,
>> > > >
>> > > > I really appreciate your help.
>> > > > From the input file I have - 19 columns, 7000 rows - the scan gives
>> > > > me
>> > > > the desired format of a list consisting of 19 lists with 7000 values
>> > > > each.
>> > > > However I need a list of 7000 lists with 19 values each. (e.g. each
>> > > > row
>> > > > of my input file should be a separate list bound in a list of all
>> > > > these
>> > > > lists)
>> > > > I use both commands you suggested -
>> > > > x <- scan('temp.txt', what=c(rep(list(0), 19)))
>> > > > followed by
>> > > > x.matrix <- do.call('rbind', x) # gives 7000 x 19 matrix.
>> > > >
>> > > > Although this makes a matrix of the correct dimensions it is not the
>> > > > "list of lists" the ROCR package expects as input. Can you convert
>> > > > this
>> > > > matrix into a "list of lists"? Or is there a simple way in R to
>> > > > convert
>> > > > a table into such a "list of lists"?
>> > > >
>> > > > Thanks again,
>> > > >
>> > > > Tomas
>> > > >
>> > > >
>> > > > jim holtman wrote:
>> > > > > That is what I thought and that is the format that the 'scan'
>> > > > > approach
>> > > > > should provide. I was just confused when you said that you were
>> > > > > going
>> > > > > to have to transpose it, write it and then read it back in for
>> > > > > some
>> > > > > reason. I understand that Excel can not handle 7000 columns, but
>> > > > > was
>> > > > > wondering where that came into play.
>> > > > >
>> > > > > On 10/21/07, Tomas Vaisar <tvaisar at u.washington.edu> wrote:
>> > > > >
>> > > > > > The data I have is tab delimited file with 7000 lines of 19
>> > > > > > values
>> > > > > > each
>> > > > > > (representing 7000 permutations on 19 variables). I want to get
>> > > > > > it
>> > > > > > into
>> > > > > > the ROCR package which expects the data to be in lists - single
>> > > > > > list of
>> > > > > > 19 values for each permutation, e.g. list of 7000 lists of 19
>> > > > > > values each.
>> > > > > >
>> > > > > > I hope this is little clearer.
>> > > > > >
>> > > > > > Tomas
>> > > > > >
>> > > > > > jim holtman wrote:
>> > > > > >
>> > > > > > > What is it that you want to do? The 'scan' statement gives you
>> > > > > > > a list
>> > > > > > > of length 7000 with 19 entries each. Do you want to create a
>> > > > > > > matrix
>> > > > > > > that has 7000 rows by 19 columns? If so, then you just have
>> > > > > > > to take
>> > > > > > > the output of the 'scan' and do:
>> > > > > > >
>> > > > > > > x.matrix <- do.call('rbind', x) # gives 7000 x 19 matrix.
>> > > > > > >
>> > > > > > > So I am still not sure exactly what your input is and what you
>> > > > > > > want to
>> > > > > > > do with it.
>> > > > > > >
>> > > > > > > On 10/21/07, Tomas Vaisar <tvaisar at u.washington.edu> wrote:
>> > > > > > >
>> > > > > > >
>> > > > > > > > Hi Jim,
>> > > > > > > >
>> > > > > > > > thanks a lot. It works, however - my other problem is that
>> > > > > > > > I
>> > > > > > > > need to
>> > > > > > > > transpose the original table before reading it into the list
>> > > > > > > > because the
>> > > > > > > > data come from Excel and it can't handle 7000 columns. I
>> > > > > > > > could
>> > > > > > > > read it
>> > > > > > > > in R, transpose and write into a new tab delim file and then
>> > > > > > > > read
>> > > > > > > > it back
>> > > > > > > > in, but I would think that there might be a way in R to do
>> > > > > > > > both.
>> > > > > > > > Would you know about the way?
>> > > > > > > >
>> > > > > > > > Tomas
>> > > > > > > >
>> > > > > > > > jim holtman wrote:
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > > another choice is:
>> > > > > > > > >
>> > > > > > > > > x <- scan('temp.txt', what=c(rep(list(0), 19)))
>> > > > > > > > >
>> > > > > > > > > On 10/20/07, Tomas Vaisar <tvaisar at u.washington.edu>
>> > > > > > > > > wrote:
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > Hi,
>> > > > > > > > > >
>> > > > > > > > > > I am new to R and need to read in a file with 19 columns
>> > > > > > > > > > and
>> > > > > > > > > > 7000 rows
>> > > > > > > > > > and make it into a list of 7000 lists with 19 items
>> > > > > > > > > > each. For a
>> > > > > > > > > > simpler case of 10 by 10 table I used x <-scan("file",
>> > > > > > > > > > list(0,0,0,0,0,0,0,0,0,0)), perhaps clumsy, but it did
>> > > > > > > > > > the job.
>> > > > > > > > > > However with the large 19x7000 (which needs to be
>> > > > > > > > > > transposed) I
>> > > > > > > > > > am not
>> > > > > > > > > > sure how to go about it.
>> > > > > > > > > >
>> > > > > > > > > > Could somebody suggest a way?
>> > > > > > > > > >
>> > > > > > > > > > Thanks,
>> > > > > > > > > >
>> > > > > > > > > > Tomas
>> > > > > > > > > >
>> > > > > > > > > > ______________________________________________
>> > > > > > > > > > R-help at r-project.org mailing list
>> > > > > > > > > > https://stat.ethz.ch/mailman/listinfo/r-help
>> > > > > > > > > > PLEASE do read the posting guide
>> > > > > > > > > > http://www.R-project.org/posting-guide.html
>> > > > > > > > > > and provide commented, minimal, self-contained,
>> > > > > > > > > > reproducible code.
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > >
>> > > >
>> > >
>> > > Charles C. Berry (858) 534-2098
>> > > Dept of Family/Preventive Medicine
>> > > E mailto:cberry at tajo.ucsd.edu UC San Diego
>> > > http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
>> > >
>> > >
>> >
>> >
>>
>> Charles C. Berry (858) 534-2098
>> Dept of Family/Preventive Medicine
>> E mailto:cberry at tajo.ucsd.edu UC San Diego
>> http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
>>
>>
>
Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901