[R] extracting rows and columns from a big matrix

William Dunlap wdunlap at tibco.com
Mon Jul 16 19:39:13 CEST 2012


Did you ever show the code that caused the problem?
In particular, was it one very long line of code?  It is possible
that copying and pasting a long line into R might cause problems,
but the details would depend on which OS you are using and
which user interface you are using.  The "+" prompt means
that R did not see the end of an expression, typically because
of an unmatched quotation mark or left parenthesis.
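
If the long pasted line is the culprit, one way to sidestep it is to keep
the labels in a file, read them into a character vector, and index with
that vector, so no long line is ever typed at the prompt.  A minimal
sketch, assuming the labels are stored one per line; the matrix m, the
file label_file, and the labels used here are made-up stand-ins, not the
poster's data:

```r
# Small stand-in for the large labelled matrix (hypothetical data).
m <- matrix(seq_len(36), nrow = 6,
            dimnames = list(paste0("X", 1:6), paste0("X", 1:6)))

# Store the wanted labels one per line in a file (a temporary file here)
# instead of pasting them into one long c(...) call.
label_file <- tempfile(fileext = ".txt")
writeLines(c("X2", "X4", "X5"), label_file)

# Read the labels back as a character vector.
labels <- readLines(label_file)

# Confirm every label exists before subsetting.
missing_labels <- setdiff(labels, colnames(m))
stopifnot(length(missing_labels) == 0)

# Extract the square submatrix by row and column names.
sub <- m[labels, labels, drop = FALSE]
sub
```

The same indexing should work unchanged with a couple of thousand labels,
since the vector is built from the file rather than from typed input.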

Of course, the whole line might have been entered and there
may have been a missing ")" or "'" in it, but no one can tell
without seeing the code.
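
One way to check a command for a missing quote or parenthesis without
pasting it at the prompt is to hand it to parse() as a string; parse()
signals an error when the text is not a complete, valid expression.  A
rough sketch; the helper name parses_ok and the two sample strings are
only illustrations:

```r
# Return TRUE if the text parses as complete R code, FALSE otherwise.
# Note: FALSE is returned for any syntax error, not only incomplete input.
parses_ok <- function(code) {
  result <- tryCatch(parse(text = code), error = function(e) e)
  !inherits(result, "error")
}

ok  <- 'subset(m, select = c("X1", "X7", "X12"))'
bad <- 'subset(m, select = c("X1", "X7", "X12)'  # missing quote and parenthesis

parses_ok(ok)
parses_ok(bad)
```

parse() only parses the text and does not evaluate it, so m does not need
to exist for the check to run.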

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
> Behalf Of A J
> Sent: Monday, July 16, 2012 9:56 AM
> To: smartpink111 at yahoo.com
> Cc: r-help at r-project.org
> Subject: Re: [R] extracting rows and columns from a big matrix
> 
> 
> I guess R is powerful enough to subset more than 300x300. But if the problem is in
> the dataset, I wonder why the code works when the set of 1788 journal labels is
> split into 6 groups of around 300 labels... Summarizing: if you use the whole set, it
> doesn't work. If you divide it into groups of around 300 labels, it does.
> 
> I still think it's a little bit strange. I have checked the labels from the matrix and
> there is no inconsistency.
> 
> I will try to extract the different groups and then join them with the "merge"
> function. It may not be the best solution, but at least I hope it works.
> 
> Thanks everybody!
> 
> AJ
> 
> 
> > Date: Mon, 16 Jul 2012 09:08:15 -0700
> > From: smartpink111 at yahoo.com
> > Subject: Re: [R] extracting rows and columns from a big matrix
> > To: anxusgo at hotmail.com
> > CC: r-help at r-project.org
> >
> > Hi,
> >
> > If you think that R may not be able to subset greater than 300X300,
> >
> > Try this:
> > m<-matrix(numeric(350*2000),ncol=2000)
> > colnames(m)<-paste("X",1:2000,sep="")
> > rownames(m)<-paste("X",1:350,sep="")
> > m[c("X6","X20","X151","X180"),c("X25","X150","X1500","X1750")]
> >      X25 X150 X1500 X1750
> > X6     0    0     0     0
> > X20    0    0     0     0
> > X151   0    0     0     0
> > X180   0    0     0     0
> >
> >
> >
> > #So, I guess there might be some problems in your dataset.
> >
> > A.K.
> >
> >
> >
> >
> >
> > ________________________________
> > From: A J <anxusgo at hotmail.com>
> > To: smartpink111 at yahoo.com
> > Sent: Monday, July 16, 2012 10:37 AM
> > Subject: RE: [R] extracting rows and columns from a big matrix
> >
> >
> >
> > Hello again!
> >
> >
> > Sorry for the inconvenience and thanks to everybody trying to help me. The steps I
> > followed in the process are these:
> >
> >
> > 1) Open the .csv file containing the large matrix (15000 rows x 15000 columns) using
> > read.table.
> > 2) If I try to subset the columns and/or rows that I need, just 1788 of them
> > (giving a new square submatrix), R does not do it, and the console shows the "+"
> > sign at the end.
> > 3) To check that there are no mistakes, I copied the labels from the .csv file and
> > compared them with the original database containing all the data. There are no
> > mistakes.
> > 4) After inspecting all the data in Excel, I decided to split the columns that I need
> > to subset into several parts. After several tests, I found that R works if
> > the number of columns requested is no higher than 300 (maybe a little higher,
> > but I don't want to spend time running so many tests).
> > 5) I think the best solution may be to divide the data into groups of
> > around 300 rows x 300 columns and then join them using, for instance, the
> > "merge" function to get the final square submatrix of 1788 x 1788.
> >
> >
> > I think it is all really strange, but I have run several tests with different methods
> > and I can't find a good and consistent explanation. Perhaps label length has some
> > connection, but I am not sure. I will report the results.
> >
> >
> > Greetings and thanks again
> >
> >
> > AJ
> >
> >
> > > Date: Mon, 16 Jul 2012 07:05:44 -0700
> > > From: smartpink111 at yahoo.com
> > > Subject: Re: [R] extracting rows and columns from a big matrix
> > > To: anxusgo at hotmail.com
> > >
> > > Hello AJ,
> > >
> > > If I understand your email, there is no problem in subsetting n (say 300 or 400)
> > > columns from the 1st and 2nd split sets (447 columns).  Try saving the
> > > third and fourth sets using write.csv and open them in Excel to look for any
> > > anomalies.  How did you split the files?  Is it after reading them into R?
> > >
> > >
> > > A.K.
> > >
> > >
> > >
> > >
> > > ________________________________
> > > From: A J <anxusgo at hotmail.com>
> > > To: smartpink111 at yahoo.com
> > > Cc: r-help at r-project.org
> > > Sent: Monday, July 16, 2012 9:10 AM
> > > Subject: RE: [R] extracting rows and columns from a big matrix
> > >
> > >
> > >
> > > Yes, I have tried it and this works.
> > >
> > >
> > > Indeed, if I use a small number of columns, all the methods proposed here
> > > work. Following the previous mail, I have split the columns into 4 parts of
> > > 447 columns each. The first and second parts work well, but the third and
> > > fourth do not. I am convinced it's not a problem with quotes, because I
> > > tried removing them, and again the code for the first and second parts worked well.
> > > Now I have copied all the labels directly from the original matrix into a txt file to
> > > avoid any other mistakes. I will let you know about the enigmatic problem when I find
> > > it (I hope so...).
> > >
> > >
> > > Thanks for your comments and help.
> > >
> > >
> > > AJ
> > >
> > >
> > >
> > >
> > > > Date: Mon, 16 Jul 2012 05:46:46 -0700
> > > > From: smartpink111 at yahoo.com
> > > > Subject: Re: [R] extracting rows and columns from a big matrix
> > > > To: anxusgo at hotmail.com
> > > > CC: r-help at r-project.org
> > > >
> > > > Hello,
> > > >
> > > > Have you tried subsetting a smaller number of columns (say 5 or 6) from the 2000
> > > > column dataset?  If it is not working, then there might be problems in reading the
> > > > dataset.
> > > >
> > > > A.K.
> > > >
> > > >
> > > >
> > > >
> > > > ________________________________
> > > > From: A J <anxusgo at hotmail.com>
> > > > To: smartpink111 at yahoo.com
> > > > Cc: r-help at r-project.org
> > > > Sent: Monday, July 16, 2012 6:49 AM
> > > > Subject: RE: [R] extracting rows and columns from a big matrix
> > > >
> > > >
> > > >
> > > > Thank you very much to everybody for your fast responses.
> > > >
> > > >
> > > > All your solutions work well, but I still have the same problem. When I
> > > > use any of your proposals with a small set of columns (and/or rows), it works,
> > > > but when I use the whole set of columns (and/or rows), around 2000
> > > > columns, the system doesn't return the specified submatrix and the ">" prompt is
> > > > replaced by "+" at the end of the console. Could this be due to a limitation in
> > > > subsetting matrices?
> > > >
> > > >
> > > > This is an example code working and using only columns:
> > > >
> > > >
> > > > m<-read.table("C:/backup/Rfiles/sym_matrix_cos.csv", header=T)
> > > >
> > > >
> > > > o<-as.matrix(m[c("X12002", "X12027", "X12054", "X12084", "X12085",
> > > > "X12115", "X12129", "X12139", "X12195", "X12223", "X12295", "X12327", "X12356",
> > > > "X12474", "X12487", "X12491", "X12520", "X12570", "X12600", "X12616", "X12626",
> > > > "X12629", "X12634", "X12669", "X12685", "X12748", "X12759", "X12766", "X12789",
> > > > "X12793", "X12814", "X12824", "X12892", "X12897", "X12909", "X12932", "X12959",
> > > > "X12995", "X13018", "X13039", "X13134", "X13138", "X13162", "X13173", "X13236",
> > > > "X13243", "X13351", "X13410", "X13452", "X13474", "X13475", "X13486", "X13518",
> > > > "X13574", "X13586", "X13588")])
> > > >
> > > > >
> > > >
> > > >
> > > > However, when I use the same code with the full set of columns
> > > > (around 2000), it does not work.
> > > >
> > > >
> > > > I have checked all the labels several times to avoid mistakes. For this
> > > > reason I copied and pasted all the labels from a database into a spreadsheet, where I
> > > > added the quotes by dragging them from the first cell to the last one (so as not to
> > > > miss any quotes). I really have no idea why R allows this
> > > > code with 56 columns (as in the example above) but not with
> > > > around 2000 columns. If you have any suggestions, please let me know.
> > > >
> > > >
> > > > Thanks to everybody again.
> > > >
> > > >
> > > > Best,
> > > >
> > > >
> > > > AJ
> > > >
> > > >
> > > >
> > > > > Date: Sun, 15 Jul 2012 19:09:05 -0700
> > > > > From: smartpink111 at yahoo.com
> > > > > Subject: Re: [R] extracting rows and columns from a big matrix
> > > > > To: anxusgo at hotmail.com
> > > > > CC: r-help at r-project.org
> > > > >
> > > > > Hello,
> > > > >
> > > > > In my previous email, I used an index to subset the data.  Then, I looked at your
> > > > > code.  I guess you wanted to try the "subset" function to get the same output.
> > > > >
> > > > > Try this:
> > > > > dat1<-read.table(text="
> > > > >   X1 X7 X12 X15 X22 X26 X31 X34 X39 X44 X51
> > > > > X1  1  2   3   4  5  6  7  8  9 10  11
> > > > > X7  11  9  7  5   3  1 10 8 6  4  2
> > > > > X12 3  4  7  8  5   7  2  9  1  3  2
> > > > > X15 9  9  8  4  7  1   1  3  2  5  3
> > > > > X22 6  7  7  4  4  2  9  8  8  1  1
> > > > > X26 3  9  4  8  5  7  6  1  2  3  8
> > > > > X31 1  2  1  3  1  4  1  5  1  6  1
> > > > > X34 6  7  8  5  2  9  5  1  6  8  9
> > > > > X39 4  8  7  4  6  5  1  9  2  7  5
> > > > > X44 2  2  2  8  6  7  9  5  3  7  7
> > > > > X51 9  9  9  6  6  4  8  7  2  1  3
> > > > > ",sep="", header=TRUE)
> > > > >
> > > > > subset(dat1,subset=row.names(dat1)%in%
> > > > > c("X1","X12","X22","X31"),select=c("X1","X12","X22","X31"))
> > > > >     X1 X12 X22 X31
> > > > > X1   1   3   5   7
> > > > > X12  3   7   5   2
> > > > > X22  6   7   4   9
> > > > > X31  1   1   1   1
> > > > >
> > > > > A.K.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > ----- Original Message -----
> > > > > From: A J <anxusgo at hotmail.com>
> > > > > To: jholtman at gmail.com
> > > > > Cc: r-help at r-project.org
> > > > > Sent: Sunday, July 15, 2012 3:43 PM
> > > > > Subject: Re: [R] extracting rows and columns from a big matrix
> > > > >
> > > > >
> > > > > Sorry so much for mistakes.
> > > > >
> > > > > It was example code and I made some mistakes typing it. But assuming
> > > > > the original code is right (I have checked several times), I am not sure how to
> > > > > solve the problem of extracting columns and rows by label from a square matrix.
> > > > > I have enclosed a text file with the idea to make it easier to understand.
> > > > >
> > > > > Thanks again, and sorry for the inconvenience.
> > > > >
> > > > > Best,
> > > > >
> > > > > AJ
> > > > >
> > > > >
> > > > >
> > > > > > Date: Sun, 15 Jul 2012 14:53:47 -0400
> > > > > > Subject: Re: [R] extracting rows and columns from a big matrix
> > > > > > From: jholtman at gmail.com
> > > > > > To: anxusgo at hotmail.com
> > > > > > CC: r-help at r-project.org
> > > > > >
> > > > > > For a start, you are missing a quote and a parenthesis on the
> > > > > > statement; probably should be: (another quote was also missing)
> > > > > >
> > > > > > n<-subset(m, select=c("X1", "X7", "X12","X15", "X22", "X26", "X31",
> > > > > > "X34", "X39", "X44", "X51", "X58"))
> > > > > >
> > > > > > Not sure what you want with the rownames; an example would help and
> > > > > > post with 'dput'.
> > > > > >
> > > > > > On Sun, Jul 15, 2012 at 2:47 PM, A J <anxusgo at hotmail.com> wrote:
> > > > > > >
> > > > > > > Hi there and thanks in advance.
> > > > > > >
> > > > > > > I have a large symmetric matrix stored in a text file. After loading it in R, I
> > > > > > > would like to extract the same set of columns and rows (a symmetric submatrix)
> > > > > > > using their labels.
> > > > > > >
> > > > > > > I have tried this code to extract the columns, but the R console gives me
> > > > > > > the "+" sign at the end of the code, indicating an incomplete command, so it is
> > > > > > > not working:
> > > > > > >
> > > > > > > m<-read.table("C:/backup/symmetrical.csv")
> > > > > > >
> > > > > > > n<-subset(m, select=c("X1", "X7", "X12", X15", "X22", "X26", "X31", "X34",
> "X39", "X44", "x51", "X58)
> > > > > > >
> > > > > > > Therefore, I have not tried with row names yet.
> > > > > > >
> > > > > > > Any suggestions? Sorry for the inconvenience. I have read some
> > > > > > > information about this but always have the same problem with "+", and I do not
> > > > > > > know how to proceed.
> > > > > > >
> > > > > > > Best,
> > > > > > >
> > > > > > > AJ
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >         [[alternative HTML version deleted]]
> > > > > > >
> > > > > > > ______________________________________________
> > > > > > > R-help at r-project.org mailing list
> > > > > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > > > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > > > > > and provide commented, minimal, self-contained, reproducible code.
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Jim Holtman
> > > > > > Data Munger Guru
> > > > > >
> > > > > > What is the problem that you are trying to solve?
> > > > > > Tell me what you want to do, not how you want to do it.
> > > > >
> > > > >
> 


