[Bioc-sig-seq] How to generate random nucleotide sequences?
Purnachander
purna at atc.tcs.com
Tue Sep 7 07:24:04 CEST 2010
Hello All,
I generated random nucleotide sequences whose trinucleotide frequencies
are almost equal to those of a query sequence, using the "sample"
function in the following way:
seq1 <- paste(sample(alpha, 333, replace=TRUE, prob=freq), collapse="")
where "alpha" is a vector of the 64 possible trinucleotides over the set
c("A","G","C","T") and *"freq" is the vector of frequencies of those 64
trinucleotides in a given query sequence*.
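
The post does not show how "alpha" and "freq" were built, so the following is only a minimal sketch of one way to set them up in base R. The query string is made up for illustration, and counting overlapping trinucleotides is an assumption (counting in a fixed reading frame would be an equally plausible reading).

```r
bases <- c("A", "G", "C", "T")

# Hypothetical query sequence, made up for illustration only
set.seed(1)
query <- paste(sample(bases, 999, replace = TRUE), collapse = "")

# All 64 trinucleotides over the alphabet
alpha <- as.vector(outer(outer(bases, bases, paste0), bases, paste0))

# Relative frequencies of overlapping trinucleotides in the query
# (assumption: overlapping windows, not a fixed reading frame)
n <- nchar(query)
tri <- substring(query, 1:(n - 2), 3:n)
freq <- as.numeric(table(factor(tri, levels = alpha))) / length(tri)

# 333 independent trinucleotide draws, giving a 999 nt sequence
seq1 <- paste(sample(alpha, 333, replace = TRUE, prob = freq), collapse = "")
```

Because each of the 333 trinucleotides is drawn independently, neighbouring triplets in "seq1" carry no information about each other.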
Consider a random sequence generated in the way described above. Does
it preserve the mono- and dinucleotide frequencies of the query
sequence? That is, are the mono- and dinucleotide frequencies of the
random sequence similar to those of the query sequence?
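
One way to check this is to tabulate overlapping k-mers in both sequences and compare them. The sketch below is only an illustration in base R: the helpers "all_kmers" and "kmer_freq" and the motif-based query are hypothetical and not part of the original post. Note that dinucleotides spanning the junction between two independently drawn triplets follow the product of the single-base frequencies rather than the query's dinucleotide frequencies, so some mismatch at k = 2 is to be expected whenever neighbouring bases in the query are correlated.

```r
bases <- c("A", "G", "C", "T")

# All k-mers over the alphabet, in a fixed order (hypothetical helper)
all_kmers <- function(k) {
  out <- bases
  if (k > 1) for (i in 2:k) out <- as.vector(outer(out, bases, paste0))
  out
}

# Relative frequencies of overlapping k-mers in a string (hypothetical helper)
kmer_freq <- function(s, k) {
  n <- nchar(s)
  kmers <- substring(s, 1:(n - k + 1), k:n)
  counts <- table(factor(kmers, levels = all_kmers(k)))
  as.numeric(counts) / sum(counts)
}

# Made-up query with strong local structure, for illustration only
set.seed(1)
query <- paste(sample(c("ACG", "TTA", "GGC", "CAT"), 333, replace = TRUE),
               collapse = "")

# Random sequence built from independent trinucleotide draws, as in the post
seq1 <- paste(sample(all_kmers(3), 333, replace = TRUE,
                     prob = kmer_freq(query, 3)), collapse = "")

# Largest absolute difference in frequency for mono- and dinucleotides
for (k in 1:2) {
  d <- max(abs(kmer_freq(seq1, k) - kmer_freq(query, k)))
  cat("k =", k, "largest absolute frequency difference:", d, "\n")
}
```

With only 333 draws, part of any difference is plain sampling noise, so repeating the comparison over several random sequences gives a fairer picture.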
In one of the cases I worked with, the answer to the above question was
"No". If that is the case in general, how can I generate a random
sequence that preserves the mono-, di- and trinucleotide frequencies of
the query sequence?
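
One commonly used approach, which does not come from the original post, is to simulate a second-order Markov chain fitted to the query: each new base is drawn with probability proportional to the query's count of the trinucleotide formed by the previous two bases plus the candidate base. The sketch below assumes base R only; the function name "simulate_markov2" and the query are hypothetical. It preserves the mono-, di- and trinucleotide frequencies only approximately and in expectation, not exactly.

```r
bases <- c("A", "G", "C", "T")
alpha <- as.vector(outer(outer(bases, bases, paste0), bases, paste0))

# Made-up query sequence, for illustration only
set.seed(1)
query <- paste(sample(c("ACG", "TTA", "GGC", "CAT"), 333, replace = TRUE),
               collapse = "")

# Counts of overlapping trinucleotides in the query, named by trinucleotide
n <- nchar(query)
tri <- substring(query, 1:(n - 2), 3:n)
tri_counts <- setNames(as.numeric(table(factor(tri, levels = alpha))), alpha)

# Hypothetical helper: second-order Markov simulation, len must be >= 4
simulate_markov2 <- function(len) {
  out <- character(len)
  # Start from a trinucleotide drawn according to its count in the query
  start <- sample(alpha, 1, prob = tri_counts)
  out[1:3] <- strsplit(start, "")[[1]]
  for (i in 4:len) {
    # Weights for the next base given the two preceding bases
    w <- tri_counts[paste0(out[i - 2], out[i - 1], bases)]
    # Fallback (assumption): uniform draw if this context has no successor
    if (sum(w) == 0) w <- rep(1, 4)
    out[i] <- sample(bases, 1, prob = w)
  }
  paste(out, collapse = "")
}

seq2 <- simulate_markov2(999)
```

If the frequencies must match exactly rather than on average, a shuffling method that conserves k-mer counts would be needed instead; that is outside the scope of this sketch.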
Regards,
Purnachander G