[R] selecting a subset of files to be processed
(Ted Harding)
Ted.Harding at wlandres.net
Sat Jul 28 20:32:29 CEST 2012
And, in addition to the tip from Rui (and similar from Joshua) below,
I would advise that there is one good reason not to try doing it
in "pure Linux".
The only source (that I know of) in Linux itself for random numbers
can be tapped by something like
cat /dev/random > filename
/dev/random stores noise generated by the timings of system events
(keyboard presses, mouse-clicks, disk accesses, interrupts, etc.)
after subjecting them to a high-entropy stirring process. See:
man 4 random
It yields them in the form of random bytes (each of 8 random 0/1 bits)
and you would have to devise some means of converting those into a
form suitable for accessing a directory listing at random. Not a
pretty task!
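To give an idea of what that conversion involves, here is a minimal sketch in C, not a polished tool. It assumes the bytes come from some already opened stream and that you want an index between 0 and n-1; the function name random_index is made up for illustration. It packs four bytes into a 32-bit number and throws away values that would bias the result (plain "v % n" slightly favours the low indices).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: turn random bytes read from src into an index
   in [0, n). Returns 0 on success, -1 if four bytes could not be read. */
int random_index(FILE *src, uint32_t n, uint32_t *out)
{
    assert(n > 0);
    /* Largest multiple of n not exceeding 2^32; values at or above
       it are rejected so that every index is equally likely. */
    uint64_t limit = ((uint64_t)1 << 32) / n * n;

    for (;;) {
        unsigned char b[4];
        if (fread(b, 1, sizeof b, src) != sizeof b)
            return -1;
        /* Assemble the four bytes into one unsigned 32-bit value. */
        uint32_t v = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16)
                   | ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
        if (v < limit) {
            *out = v % n;
            return 0;
        }
    }
}
```

One would call it with a stream opened by fopen("/dev/urandom", "rb") and n set to the number of files (about 3000 here). Using /dev/urandom rather than /dev/random is my assumption, chosen because it does not block while waiting for entropy.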
There is also the 'rand' command in the OpenSSL toolkit (openssl rand),
but that still outputs the results in the same format as /dev/random.
If you really want to do this outside R, then I would suggest writing
a little C program (to be run from the Linux command line). C can
do its own random number generation, with results returned as
real (double), and then apply these to select at random from the
contents of a file generated by something like
ls filesdir > filelist.txt
and output the random selection.
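A rough sketch of the core of such a program follows, under some assumptions of mine: the names read_names, sample_front and unif are invented for illustration, lines are assumed shorter than 4096 characters, and the standard rand() is scaled to a double in [0, 1) as the random source. The selection is a partial Fisher-Yates shuffle, which draws without replacement in the same spirit as sample() in R.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Uniform double in [0, 1) built from the standard rand(). */
static double unif(void)
{
    return rand() / (RAND_MAX + 1.0);
}

/* Move k randomly chosen entries of names[0..n-1] to the front
   (partial Fisher-Yates shuffle); names[0..k-1] is then the sample. */
void sample_front(char **names, size_t n, size_t k)
{
    assert(k <= n);
    for (size_t i = 0; i < k; i++) {
        /* Pick j uniformly from i..n-1 and swap it into position i. */
        size_t j = i + (size_t)(unif() * (double)(n - i));
        char *tmp = names[i];
        names[i] = names[j];
        names[j] = tmp;
    }
}

/* Read one file name per line from path. Returns the number of names
   read and stores the allocated array in *out (NULL if none). */
size_t read_names(const char *path, char ***out)
{
    FILE *f = fopen(path, "r");
    char line[4096];
    char **names = NULL;
    size_t n = 0, cap = 0;

    *out = NULL;
    if (f == NULL)
        return 0;
    while (fgets(line, sizeof line, f) != NULL) {
        line[strcspn(line, "\n")] = '\0';      /* strip the newline */
        if (n == cap) {                        /* grow the array */
            cap = cap ? 2 * cap : 64;
            char **grown = realloc(names, cap * sizeof *names);
            if (grown == NULL)
                break;
            names = grown;
        }
        names[n] = malloc(strlen(line) + 1);
        if (names[n] == NULL)
            break;
        strcpy(names[n], line);
        n++;
    }
    fclose(f);
    *out = names;
    return n;
}
```

A main() would seed the generator once with srand(), call read_names("filelist.txt", &names), then sample_front(names, n, 45), and print the first 45 entries. Seeding from the clock is adequate for picking files, though not for anything security related.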
Ted.
On 28-Jul-2012 18:00:38 Rui Barradas wrote:
> Hello,
>
> If the files are to be processed in R select a random sample in R.
> Using list.files() you can assign a character vector with the filenames
> of interest and then sample from that vector.
>
> ?list.files
> filenames <- list.files(path, pattern)
>
> rand.sampl <- sample(filenames, 45)
>
> Hope this helps,
>
> Rui Barradas
>
> Em 28-07-2012 18:49, Erin Hodgess escreveu:
>> Dear R People:
>>
>> I am using a Linux system in which I have about 3000 files.
>>
>> I would like to randomly select about 45 of those files to be processed in
>> R.
>>
>> Could I make the selection in R or should I do it in Linux, please?
>>
>> This is with R-2.15.1.
>>
>> Thanks,
>> erin
>>
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
-------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
Date: 28-Jul-2012 Time: 19:32:26
This message was sent by XFMail