[R] Adjusting length of series
David Winsemius
dwinsemius at comcast.net
Mon Jul 2 16:26:50 CEST 2012
On Jul 2, 2012, at 5:13 AM, Lekgatlhamang, lexi Setlhare wrote:
> Hi David and AK,
> I have been trying to implement your suggestions since yesterday,
> but I encountered some challenges.
>
> As for David's suggestions, I could only implement them after some
> modifications. Using an abridged version of my data, I dput my
> dataset and then show my steps below.
Well, your initial question (why the $ referencing did not work) is
now answered. This is not a dataframe but rather a 'ts' classed object
and there is no `$` method for such objects. They are really matrices
with some extra attributes.
> ydata$BoBCL1
Error in ydata$BoBCL1 : $ operator is invalid for atomic vectors
As I understood it you were able to get useful analyses using the
formula methods for lm on these objects, but were just having
difficulty with the "$" operator. So the answer is ..... don't do that.
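
A minimal sketch of what does work on such objects, assuming a small made-up monthly series `m` as a stand-in for `ydata` (the name and the values are illustrative, not the poster's data). Columns of an "mts" object can be pulled out with matrix-style indexing, and `as.data.frame()` gives an object on which `$` is valid:

```r
# Illustrative stand-in for ydata: a 6 x 2 monthly multivariate series
m <- ts(cbind(CredL1 = c(4937, 5005.1, 4970.3, 5060.7, 5115.3, 4943),
              BoBCL1 = c(4187.5, 4296.0, 4240.1, 4201.2, 4258.3, 4995.6)),
        start = c(2001, 2), frequency = 12)

class(m)            # includes "mts" and "ts"
is.data.frame(m)    # FALSE, hence no `$`

# Matrix-style indexing by column name returns a univariate 'ts'
bobc <- m[, "BoBCL1"]

# Converting to a data frame makes `$` available
# (the time-series attributes are dropped in the conversion)
df <- as.data.frame(m)
df$BoBCL1
```

Formula interfaces such as `lm(CredL1 ~ BoBCL1, data = m)` look the variables up by column name, so for modelling the `$` step is usually not needed at all.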
--
David.
>
>> dput(ydata)
> structure(c(68.1000000000004, -34.8000000000002, 90.3999999999996,
> 54.6000000000004, -172.3, 51.8000000000002, 175, 79.8000000000002,
> -35.7000000000007, 130.5, 116.8, -67.5, 164.5, 514.8, -326.1,
> 98.4000000000005, 160.2, 53.1999999999998, 283.6, -111.6, 127.8,
> -17.3000000000002, 286.3, NA, NA, -102.900000000001, 125.2,
> -35.7999999999993,
> -226.900000000001, 224.1, 123.2, -95.1999999999998, -115.500000000001,
> 166.200000000001, -13.6999999999998, -184.3, 232, 350.3,
> -840.900000000001,
> 424.500000000001, 61.7999999999993, -107, 230.400000000001,
> -395.200000000001,
> 239.400000000001, -145.1, 303.6, NA, NA, NA, 228.1, -160.999999999999,
> -191.100000000001, 451.000000000001, -100.900000000001, -218.4,
> -20.3000000000011, 281.700000000002, -179.900000000001, -170.6,
> 416.3, 118.3, -1191.2, 1265.4, -362.700000000002, -168.799999999999,
> 337.400000000001, -625.600000000001, 634.600000000001,
> -384.500000000001,
> 448.700000000001, NA, NA, -164.457840999999, 17.0793539999995,
> 95.9767880000009, 680.238166999999, -491.348690999999, -274.694009,
> -256.332907, 469.62296, -146.431891, -41.0772019999995, -106.970104,
> 757.688263999999, -1689.214533, 2320.098952, -1446.97942, 516.384521,
> -375.277650999999, 293.867029999999, 417.845195, 278.198807,
> -968.592033999999, -314.195986, NA, NA, NA, 181.537194999999,
> 78.8974340000013, 584.261378999998, -1171.586858, 216.654681999999,
> 18.3611019999998, 725.955867, -616.054851, 105.354689000001,
> -65.8929020000005, 864.658367999999, -2446.902797, 4009.313485,
> -3767.078372, 1963.363941, -891.662171999999, 669.144680999999,
> 123.978165, -139.646388, -1246.790841, 654.396048, NA, 4937,
> 5005.1, 4970.3, 5060.7, 5115.3, 4943, 4994.8, 5169.8, 5249.6,
> 5213.9, 5344.4, 5461.2, 5393.7, 5558.2, 6073, 5746.9, 5845.3,
> 6005.5, 6058.7, 6342.3, 6230.7, 6358.5, 6341.2, 6627.5, 4187.5,
> 4296.004835, 4240.051829, 4201.178177, 4258.281313, 4995.622616,
> 5241.615228, 5212.913831, 4927.879527, 5112.468183, 5150.624948,
> 5147.704511, 5037.81397, 5685.611693, 4644.194883, 5922.877025,
> 5754.579747, 6102.66699, 6075.476582, 6342.153204, 7026.675021,
> 7989.395645, 7983.524235, 7663.456839), .Dim = c(24L, 7L), .Dimnames
> = list(
> NULL, c("DCred1", "DCred2", "DCred3", "DBoBC2", "DBoBC3",
> "CredL1", "BoBCL1")), .Tsp = c(2001.08333333333, 2003, 12
> ), class = c("mts", "ts"))
>
> NB: the NAs in the dataset emanated from lagging or differencing the
> series
>
> David's suggestion
> df<-data.frame(DCred1,DCred2,DCred3,DBoBC2,DBoBC3,CredL1,BoBCL1)
> Error in data.frame(DCred1, DCred2, DCred3, DBoBC2, DBoBC3, CredL1,
> BoBCL1) :
> arguments imply differing number of rows: 23, 22, 21, 24
>
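
The differing row counts are what one would expect when series are differenced separately, since each difference drops an observation. As a hedged sketch on a made-up series `x` (not the poster's data), `ts.intersect()` can align such series on their time index. Subsetting each one with `[1:21]` instead takes the first 21 values of each series by position, whatever dates they belong to.

```r
# Made-up monthly series standing in for one of the poster's variables
x  <- ts(c(10, 12, 15, 11, 18, 20, 17, 25), start = c(2001, 1), frequency = 12)
d1 <- diff(x)                    # first difference: one observation shorter
d2 <- diff(x, differences = 2)   # second difference: two observations shorter

c(length(x), length(d1), length(d2))   # 8 7 6

# ts.intersect() keeps only the time window common to all series,
# so every row refers to a single date
aligned <- ts.intersect(x = x, d1 = d1, d2 = d2)
start(aligned)                   # 2001 3
dframe  <- as.data.frame(aligned)
nrow(dframe)                     # 6
```

The argument names given to `ts.intersect()` become the column names, so the resulting data frame can be passed to `lm()` directly.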
> So I modified as follows:
> length(DCred3) # finding the minimum length of various series
> [1] 21
>
> # Then dataframe construction
> dframe <- data.frame(Dcre1=DCred1[1:21], Dcre2=DCred2[1:21],
> +                    Dcre3=DCred3[1:21], Dbobc2=DBoBC2[1:21],
> +                    Dbobc3=DBoBC3[1:21], CredL=CredL1[1:21],
> +                    BoBCL=BoBCL1[1:21])
> # Then estimated regression
>> regCred<- lm(Dcre1~Dcre2+Dcre3+Dbobc2+Dbobc3+CredL+BoBCL,
>> data=dframe)
>> summary(regCred)
> # Worked well as shown by results below
> Call:
> lm(formula = Dcre1 ~ Dcre2 + Dcre3 + Dbobc2 + Dbobc3 + CredL +
> BoBCL, data = dframe)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
> -69.516 -27.695  -8.085  13.851 107.276
>
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept) 159.32304  157.15209   1.014 0.327873
> Dcre2        -0.75527    0.17262  -4.375 0.000634 ***
> Dcre3        -0.21006    0.08656  -2.427 0.029329 *
> Dbobc2        0.05111    0.06565   0.779 0.449197
> Dbobc3        0.03106    0.03510   0.885 0.391108
> CredL        -0.10967    0.04933  -2.223 0.043177 *
> BoBCL         0.09756    0.03097   3.150 0.007087 **
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 52.3 on 14 degrees of freedom
> Multiple R-squared: 0.9331, Adjusted R-squared: 0.9044
> F-statistic: 32.55 on 6 and 14 DF, p-value: 1.911e-07
>
> This is good, but couldn't I code the process for my 15 variable
> model?
> Perhaps that is where the use of
> Dcr<- lapply(..., function(x) ...)
> comes in?
>
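
One possible way to avoid typing each variable, sketched here on a small simulated "mts" object `m` (hypothetical names and random values, not the poster's data): convert once with `as.data.frame()`, drop incomplete rows with `na.omit()`, and use the `.` shorthand in the formula, which stands for every column other than the response. Under these assumptions no `lapply()` is needed for this step.

```r
set.seed(1)
# Simulated stand-in for ydata: 24 monthly observations of 4 variables,
# with NAs at the ends as lagging or differencing might leave them
z <- matrix(rnorm(96), ncol = 4,
            dimnames = list(NULL, c("y", "x1", "x2", "x3")))
z[1, "x1"] <- NA
z[24, "y"] <- NA
m <- ts(z, start = c(2001, 2), frequency = 12)

dat <- na.omit(as.data.frame(m))   # keep complete rows only
fit <- lm(y ~ ., data = dat)       # '.' = every other column in dat
names(coef(fit))
```

The same two lines would apply unchanged to a 15-column object, because the formula never names the regressors individually.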
> AK, if you can spare some minutes, please use my dput data to illustrate
> the suggestion you made. I searched for the lapply function (using
> ??lapply) but could not get a handle on how to use it in my case. My
> dput data is as shown below.
>
>          DCred1 DCred2  DCred3      DBoBC2      DBoBC3 CredL1   BoBCL1
> Feb 2001   68.1     NA      NA          NA          NA 4937.0 4187.500
> Mar 2001  -34.8 -102.9      NA  -164.45784          NA 5005.1 4296.005
> Apr 2001   90.4  125.2   228.1    17.07935   181.53719 4970.3 4240.052
> May 2001   54.6  -35.8  -161.0    95.97679    78.89743 5060.7 4201.178
> Jun 2001 -172.3 -226.9  -191.1   680.23817   584.26138 5115.3 4258.281
> Jul 2001   51.8  224.1   451.0  -491.34869 -1171.58686 4943.0 4995.623
> Aug 2001  175.0  123.2  -100.9  -274.69401   216.65468 4994.8 5241.615
> Sep 2001   79.8  -95.2  -218.4  -256.33291    18.36110 5169.8 5212.914
> Oct 2001  -35.7 -115.5   -20.3   469.62296   725.95587 5249.6 4927.880
> Nov 2001  130.5  166.2   281.7  -146.43189  -616.05485 5213.9 5112.468
> Dec 2001  116.8  -13.7  -179.9   -41.07720   105.35469 5344.4 5150.625
> Jan 2002  -67.5 -184.3  -170.6  -106.97010   -65.89290 5461.2 5147.705
> Feb 2002  164.5  232.0   416.3   757.68826   864.65837 5393.7 5037.814
> Mar 2002  514.8  350.3   118.3 -1689.21453 -2446.90280 5558.2 5685.612
> Apr 2002 -326.1 -840.9 -1191.2  2320.09895  4009.31348 6073.0 4644.195
> May 2002   98.4  424.5  1265.4 -1446.97942 -3767.07837 5746.9 5922.877
> Jun 2002  160.2   61.8  -362.7   516.38452  1963.36394 5845.3 5754.580
> Jul 2002   53.2 -107.0  -168.8  -375.27765  -891.66217 6005.5 6102.667
> Aug 2002  283.6  230.4   337.4   293.86703   669.14468 6058.7 6075.477
> Sep 2002 -111.6 -395.2  -625.6   417.84519   123.97817 6342.3 6342.153
> Oct 2002  127.8  239.4   634.6   278.19881  -139.64639 6230.7 7026.675
> Nov 2002  -17.3 -145.1  -384.5  -968.59203 -1246.79084 6358.5 7989.396
> Dec 2002  286.3  303.6   448.7  -314.19599   654.39605 6341.2 7983.524
> Jan 2003     NA     NA      NA          NA          NA 6627.5 7663.457
>
> Thanks kindly. Lexi
David Winsemius, MD
West Hartford, CT