[R] NADA Data Frame Format: Wide or Long?
Rich Shepard
rshepard at appl-ecosys.com
Tue Jul 3 18:57:30 CEST 2012
I have water chemistry data with censored values (i.e., those less than
reporting levels) in a data frame with a narrow, or long (i.e., database
table), format. The structure is:
$ site : Factor w/ 64 levels "D-1","D-2","D-3",..: 1 1 1 1 1 1 1 1 ...
$ sampdate: Date, format: "2007-12-12" "2007-12-12" ...
$ preeq0 : logi TRUE TRUE TRUE TRUE TRUE TRUE ...
$ param : Factor w/ 37 levels "Ag","Al","Alk_tot",..: 1 2 8 17 3 4 9 ...
$ quant : num 0.005 0.106 1 231 231 0.011 0.001 0.002 0.001 100 ...
$ ceneq1 : logi TRUE FALSE TRUE FALSE FALSE FALSE ...
$ floor : num 0 0.106 0 231 231 0.011 0 0 0 100 ...
$ ceiling : num 0.005 0.106 1 231 231 0.011 0.001 0.002 0.001 100 ...
The logical 'preeq0' splits the sampling dates into two groups; 'ceneq1'
flags censored (TRUE) versus uncensored (FALSE) values; 'floor' and
'ceiling' are the lower and upper bounds for censored values, and both
equal 'quant' for uncensored values.
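To make the layout concrete, here is a minimal sketch that builds a few rows
in the same narrow format. The rows are hypothetical toy values loosely
modelled on the str() output above, not the real data; the parameter name
"As" and the rule used to fill 'floor' and 'ceiling' are assumptions inferred
from the rows shown.

```r
# Hypothetical toy rows in the narrow (long) format; not the real data.
chem <- data.frame(
  site     = factor(c("D-1", "D-1", "D-1", "D-1")),
  sampdate = as.Date(rep("2007-12-12", 4)),
  preeq0   = c(TRUE, TRUE, TRUE, TRUE),
  param    = factor(c("Ag", "Al", "Alk_tot", "As")),
  quant    = c(0.005, 0.106, 231, 0.001),
  ceneq1   = c(TRUE, FALSE, FALSE, TRUE)
)

# Assumed rule: a censored value lies between 0 and its reporting level;
# an uncensored value has floor == ceiling == quant.
chem$floor   <- ifelse(chem$ceneq1, 0, chem$quant)
chem$ceiling <- chem$quant

str(chem)
```

Each row is one site/date/parameter measurement, so adding a parameter adds
rows rather than columns.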
I plan to use the NADA package methods, but I have not found information on
whether they expect this format or the wide (i.e., spreadsheet) format.
The NADA.pdf document doesn't say; at least, I haven't found the answer
there. I can use reshape2 to melt and re-cast the data into wide format if
that is appropriate. Please point me to documents I can read for an answer
to this and related questions.
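For reference, the long-to-wide re-cast I have in mind would look roughly
like the sketch below. It uses base R's reshape() instead of reshape2 so that
it runs without extra packages, and the data frame 'long' is a hypothetical
toy example, not my data. Whether NADA wants this layout is exactly what I am
asking, so this only shows the mechanics.

```r
# Hypothetical toy data in long format: one row per site/date/parameter.
long <- data.frame(
  site     = factor(c("D-1", "D-1", "D-1", "D-2", "D-2", "D-2")),
  sampdate = as.Date(rep("2007-12-12", 6)),
  param    = factor(c("Ag", "Al", "As", "Ag", "Al", "As")),
  quant    = c(0.005, 0.106, 0.001, 0.005, 0.210, 0.003),
  ceneq1   = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE)
)

# One row per site/date; each parameter gets its own value column and its
# own censoring-indicator column (quant.Ag, ceneq1.Ag, quant.Al, ...).
wide <- reshape(long,
                idvar     = c("site", "sampdate"),
                timevar   = "param",
                direction = "wide")

names(wide)
```

In the wide form every parameter needs two columns, its value and its
censoring flag, which is why the narrow table is easier to maintain.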
Rich