[R] Find missing days

Gabor Grothendieck ggrothendieck at gmail.com
Wed Jan 2 09:41:01 CET 2008


If the missing days are any days in the range of the overall data
and not just within the range of the data within a level then just
calculate the overall range once outside of the by statement and
use that in the setdiff.
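
As a rough sketch of that suggestion (not tested against the real data), the
following rebuilds the example data from the original question and computes
the overall date range once, outside the by() call. The names all_days,
missing_by_level and d are made up for illustration, and %in% is used in
place of setdiff only because it keeps the Date class, so no structure()
call is needed:

```r
# Example data as in the original question (seq on Dates used for brevity).
y <- rnorm(60)
lev <- gl(3, 20, labels = paste("lev", 1:3, sep = ""))
date1 <- seq(as.Date("2007-09-01"), as.Date("2007-11-05"), by = "day")
date1 <- date1[-c(3, 4, 15, 34, 38, 40)]
df <- data.frame(lev = lev, date1 = date1, y = y)

# Overall range of the data, calculated once outside of by().
# all_days is a hypothetical helper name, not from the thread.
all_days <- seq(min(df$date1), max(df$date1), by = "day")

# For each level, keep the days of the overall range that the level lacks.
missing_by_level <- do.call(rbind, by(df, df$lev, function(d) {
  z <- all_days[!all_days %in% d$date1]
  if (length(z) > 0) data.frame(level = d$lev[1], date = z)
}))

head(missing_by_level)
```

If whole calendar months are wanted instead, all_days could be built from
fixed endpoints such as as.Date("2007-09-01") and as.Date("2007-11-30");
that choice is an assumption about what the range should be.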

On Jan 2, 2008 3:07 AM,  <Lauri.Nikkinen at veripalvelu.fi> wrote:
> Thanks Gabor. Neat, it creates a df, but this is not the solution I'm
> looking for. This is hard to explain, but I'll give it a try. See, in my
> original df, in lev1, there are only 20 days
>
> > range(df$date1[df$lev == "lev1"])
> [1] "2007-09-01" "2007-09-23"
>
> and we all know that in September there are 30 days. So, the missing days
> in level lev1 are
>
> >    lev      date1
> > 1  lev1 2007-09-03
> > 2  lev1 2007-09-04
> > 4  lev1 2007-09-24
> > 5  lev1 2007-09-25
> > 6  lev1 2007-09-26
> > 7  lev1 2007-09-27
> > 8  lev1 2007-09-28
> > 9  lev1 2007-09-29
> > 10 lev1 2007-09-30
>
> And corresponding missing values in level lev2 are
>
>
> > 11 lev2 2007-09-01
> > 12 lev2 2007-09-02
> > 13 lev2 2007-09-03
> > 14 lev2 2007-09-04
> > 15 lev2 2007-09-05
> > 16 lev2 2007-09-06
> > 17 lev2 2007-09-07
> > 18 lev2 2007-09-08
> > 19 lev2 2007-09-09
> > 20 lev2 2007-09-10
> > 21 lev2 2007-09-11
> > 22 lev2 2007-09-12
> > 23 lev2 2007-09-13
> > 24 lev2 2007-09-14
> > 25 lev2 2007-09-15
> > 26 lev2 2007-09-16
> > 27 lev2 2007-09-17
> > 28 lev2 2007-09-18
> > 29 lev2 2007-09-19
> > 30 lev2 2007-09-20
> > 31 lev2 2007-09-21
> > 32 lev2 2007-09-22
> > 33 lev2 2007-09-23
> > 34 lev2 2007-10-17
> > 35 lev2 2007-10-18
> > 36 lev2 2007-10-19
> > 37 lev2 2007-10-20
> > 38 lev2 2007-10-21
> > 39 lev2 2007-10-22
> > 40 lev2 2007-10-23
> > 41 lev2 2007-10-24
> > 42 lev2 2007-10-25
> > 43 lev2 2007-10-26
> > 44 lev2 2007-10-27
> > 45 lev2 2007-10-28
> > 46 lev2 2007-10-29
> > 47 lev2 2007-10-30
> > 48 lev2 2007-10-31
>
> because there are also days from October in that level.
>
> Hope this makes my problem clear.
>
> -Lauri
>
> -----Original Message-----
> From: Gabor Grothendieck [mailto:ggrothendieck at gmail.com]
>
> Sent: 2 January 2008 9:53
> To: Nikkinen Lauri
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] Find missing days
>
> Then just place it in a by like this.  The z<- line is from prior post:
>
> do.call(rbind, by(df, df$lev, function(df) {
>   z <- structure(setdiff(do.call(seq, as.list(range(df$date1))), df$date1),
>                  class = "Date")
>   if (length(z) > 0) data.frame(level = df$lev[1], date = z)
> }))
>
>
> On Jan 2, 2008 2:38 AM,  <Lauri.Nikkinen at veripalvelu.fi> wrote:
> > Thanks Gabor for your reply. Your script will find missing days in the
> > date1 range, but my intention was to find the missing days in each level
> > (df$lev) in the corresponding month. The resulting df could look like this:
> >
> >    lev      date1
> > 1  lev1 2007-09-03
> > 2  lev1 2007-09-04
> > 4  lev1 2007-09-24
> > 5  lev1 2007-09-25
> > 6  lev1 2007-09-26
> > 7  lev1 2007-09-27
> > 8  lev1 2007-09-28
> > 9  lev1 2007-09-29
> > 10 lev1 2007-09-30
> > 11 lev2 2007-09-01
> > 12 lev2 2007-09-02
> > 13 lev2 2007-09-03
> > 14 lev2 2007-09-04
> > 15 lev2 2007-09-05
> > 16 lev2 2007-09-06
> > 17 lev2 2007-09-07
> > 18 lev2 2007-09-08
> > 19 lev2 2007-09-09
> > 20 lev2 2007-09-10
> > 21 lev2 2007-09-11
> > 22 lev2 2007-09-12
> > 23 lev2 2007-09-13
> > 24 lev2 2007-09-14
> > 25 lev2 2007-09-15
> > 26 lev2 2007-09-16
> > 27 lev2 2007-09-17
> > 28 lev2 2007-09-18
> > 29 lev2 2007-09-19
> > 30 lev2 2007-09-20
> > 31 lev2 2007-09-21
> > 32 lev2 2007-09-22
> > 33 lev2 2007-09-23
> > 34 lev2 2007-10-17
> > 35 lev2 2007-10-18
> > 36 lev2 2007-10-19
> > 37 lev2 2007-10-20
> > 38 lev2 2007-10-21
> > 39 lev2 2007-10-22
> > > 40 lev2 2007-10-23
> > 41 lev2 2007-10-24
> > 42 lev2 2007-10-25
> > 43 lev2 2007-10-26
> > 44 lev2 2007-10-27
> > 45 lev2 2007-10-28
> > 46 lev2 2007-10-29
> > > 47 lev2 2007-10-30
> > 48 lev2 2007-10-31
> > 49 lev3 2007-10-01
> > etc.
> >
> > Thanks again,
> > Lauri
> >
> >
> > -----Original Message-----
> > From: Gabor Grothendieck [mailto:ggrothendieck at gmail.com]
> > Sent: 2 January 2008 9:22
> > To: Nikkinen Lauri
> > Cc: r-help at stat.math.ethz.ch
> > Subject: Re: [R] Find missing days
> >
> > Try this.  It creates a sequence of dates from the range of df$date1 and
> > then does a setdiff between that and the original dates.  The result is
> > numeric so we create a Date structure out of it.
> >
> > structure(setdiff(do.call(seq, as.list(range(df$date1))), df$date1),
> > class = "Date")
> >
> >
> > On Jan 2, 2008 1:55 AM,  <Lauri.Nikkinen at veripalvelu.fi> wrote:
> > > Hi,
> > >
> > > I have a data.frame like this:
> > >
> > > y <- rnorm(60)
> > > lev <- gl(3,20, labels=paste("lev", 1:3, sep=""))
> > > date1 <- as.Date(seq(ISOdate(2007,9,1), ISOdate(2007,11,5),
> > > by=60*60*24))
> > > date1 <- date1[-c(3,4,15,34,38,40)]
> > > df <- data.frame(lev=lev, date1=date1, y=y)
> > >
> > > I would like to produce a new data.frame with missing days in df$date1
> > > in each df$lev, like this:
> > >
> > >    lev      date1
> > > 1  lev1 2007-09-03
> > > 2  lev1 2007-09-04
> > > 3  lev1 2007-09-15
> > > 4  lev2 2007-09-01
> > > 5  lev2 2007-09-02
> > > etc.
> > >
> > > How can I do this?
> > >
> > > Thanks,
> > > Lauri
> > > FRCBS
> > >
> > >
> > >
> > >        [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
>



