[R] Proper Usage of the XREG in ARIMA

Idgarad idgarad at gmail.com
Thu Jan 17 19:52:07 CET 2008


I am using the auto.arima function (from the forecast package) to do
some basic forecasting of CPU usage. I have now found a calendar of
various activities that partially control the computer's usage, and I
want to factor that in (they are effectively dummy variables indicating
a particular type of activity that week). Per the ARIMA documentation I
am to feed those in as a vector or matrix via xreg. I am getting lost in
the sand, so to speak, at this point. How would I prepare that data? I
am pulling from a CSV that is roughly:

date,usage,allocation,number of engines, theoretical max,r1,r2,...r21

So far so good, working with a copy of the CSV that contains just

date,usage

But what should I do to dissect the configuration data and the r1 to
r21 dummy variables? (Some of these explain certain spikes and level
shifts; for instance, r21 indicates whether there was conversion
activity during the week.) I have only been using R for a week or so
and never could figure out how to pull out part of an array.

Also, should I do my dissection before or after converting it into a ts object?
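
To make the question concrete, here is a minimal sketch of what the column extraction might look like if it is done on the data frame before building the ts object. The toy data frame stands in for the real CSV, and the column names r1, r2, r21 and allocation are assumptions based on the header shown above; this is one possible approach, not necessarily the recommended one.

```r
# Toy stand-in for the CSV; the real file has r1..r21 and more config columns.
set.seed(1)
n <- 8
baseU000 <- data.frame(
  date       = seq(as.Date("2004-01-05"), by = "week", length.out = n),
  usage      = round(runif(n, 40, 90), 1),
  allocation = 100,
  r1         = rbinom(n, 1, 0.5),
  r2         = rbinom(n, 1, 0.5),
  r21        = rbinom(n, 1, 0.2)
)

# Select columns by name: every column called r<number>.
dummy_cols <- grep("^r[0-9]+$", names(baseU000), value = TRUE)

# xreg is expected to be a numeric vector or matrix, not a data frame.
xreg <- as.matrix(baseU000[, dummy_cols])

# Build the ts from the usage column only, so the date column is never
# pushed through the numeric coercion that ts() applies to a data frame.
usage_ts <- ts(baseU000$usage, start = 2004, frequency = 52)
```

The same selection works by position (for example baseU000[, 4:6] here), but selecting by name is less likely to break if the CSV layout changes.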

the short of the script is (removing plots etc..):
----------
library(forecast)
baseU000 <- read.csv("testfile.csv", header = TRUE)
#--- hmm what happens in years with a 53rd week...
tsbaseU000 <- ts(baseU000, start = 2004, frequency = 52)
#--- add regressors
# column 2 is usage; the object is tsbaseU000 (tsU000 was never defined)
arimafit <- auto.arima(tsbaseU000[, 2], approximation = TRUE, stepwise = FALSE)
forecastU000 <- forecast(arimafit, 52)

plot(forecastU000)
lines(fitted(arimafit),col=3,lty="dashed")
----------


What I am trying to do is build the best educated guess of what the
CPU usage is going to be, for planning purposes. As I control part of
the calendar, I need to start working towards the ability to do some
"what-if" analysis, so I also need to be able to provide future values
for those dummy variables. So close, yet so far away... Any suggestions?
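
As an illustration of the kind of "what-if" I have in mind, here is a rough sketch using base R's arima() and predict() with simulated data. The conversion dummy, the fixed AR(1) order, and the two scenarios are all made up for the example; with the forecast package, auto.arima() and forecast() also take an xreg argument, which I assume would be used in a similar way.

```r
# Simulated weekly usage with one hypothetical dummy (standing in for r21).
set.seed(42)
n <- 104
conversion <- rbinom(n, 1, 0.15)
usage <- 50 + 20 * conversion + as.numeric(arima.sim(list(ar = 0.5), n = n))
usage_ts <- ts(usage, start = 2004, frequency = 52)

# Regressors go in as a matrix with one named column per dummy.
xreg <- cbind(conversion = conversion)

# Fixed AR(1) order for the sketch; a real model would choose its own order.
fit <- arima(usage_ts, order = c(1, 0, 0), xreg = xreg)

# Two planned calendars for the next 4 weeks, same column layout as xreg.
h <- 4
scenario_none <- cbind(conversion = rep(0, h))
scenario_conv <- cbind(conversion = c(0, 1, 1, 0))

fc_none <- predict(fit, n.ahead = h, newxreg = scenario_none)$pred
fc_conv <- predict(fit, n.ahead = h, newxreg = scenario_conv)$pred

# The scenarios differ only by the estimated dummy effect in the active weeks.
print(round(fc_conv - fc_none, 2))
```

The future regressor matrix needs one row per forecast week and the same columns, in the same order, as the matrix used for fitting.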



