

tsboot(boot)                                 R Documentation

_B_o_o_t_s_t_r_a_p_p_i_n_g _o_f _T_i_m_e _S_e_r_i_e_s

_D_e_s_c_r_i_p_t_i_o_n_:

     Generate `R' bootstrap replicates of a statistic
     applied to a time series.  The replicate time series
     can be generated using fixed or random block lengths or
     can be model based replicates.

_U_s_a_g_e_:

     tsboot(tseries, statistic, R, sim="model", l=NULL, endcorr=TRUE,
            n.sim=length(tseries), orig.t=TRUE, ran.gen=NULL,
            ran.args=NULL, norm=TRUE, ...)

_A_r_g_u_m_e_n_t_s_:

 tseries: A univariate or multivariate time series.  The
          time series can be formed by the functions `rts'
          or `cts' (S-Plus version 3.2 and later) or the
          earlier `ts' function. Irregular time series, with
          class `"its"', may not be used.

statistic: A function which when applied to `tseries'
          returns a vector containing the statistic(s) of
          interest.  Each time `statistic' is called it is
          passed a time series of length `n.sim' which is of
          the same class as the original `tseries'.  Any
          other arguments which `statistic' takes must
          remain constant for each bootstrap replicate and
          should be supplied through the `...' argument to
          `tsboot'.

       R: A positive integer giving the number of bootstrap
          replicates required.

     sim: The type of simulation required to generate the
          replicate time series.  The possible input values
          are `"model"' (model based resampling), `"fixed"'
          (block resampling with fixed block lengths of
          `l'), `"geom"' (block resampling with block
          lengths having a geometric distribution with mean
          `l') or `"scramble"' (phase scrambling).

       l: If `sim' is `"fixed"' then `l' is the fixed block
          length used in generating the replicate time
          series.  If `sim' is `"geom"' then `l' is the mean
          of the geometric distribution used to generate the
          block lengths. `l' should be a positive integer
          less than the length of `tseries'.  This argument
          is not required when `sim' is `"model"' but it is
          required for all other simulation types.

 endcorr: A logical variable indicating whether end
          corrections are to be applied when `sim' is
          `"fixed"'.  When `sim' is `"geom"', `endcorr' is
          automatically set to `TRUE'; `endcorr' is not used
          when `sim' is `"model"' or `"scramble"'.

   n.sim: The length of the simulated time series.
          Typically this will be equal to the length of the
          original time series but there are situations when
          it will be larger.  One obvious situation is if
          prediction is required.  Another situation in
          which `n.sim' is larger than the length of
          `tseries' is if `tseries' is a residual time
          series from fitting some model to the original
          time series.  In this case, `n.sim' would usually
          be the length of the original time series.

  orig.t: A logical variable which indicates whether
          `statistic' should be applied to `tseries' itself
          as well as the bootstrap replicate series.  If
          `statistic' is expecting a longer time series than
          `tseries' or if applying `statistic' to `tseries'
          will not yield any useful information then
          `orig.t' should be set to `FALSE'.

 ran.gen: This is a function of three arguments.  The first
          argument is a time series.  If `sim' is `"model"'
          then it will always be `tseries' that is passed.
          For other simulation types it is the result of
          selecting `n.sim' observations from `tseries' by
          some scheme and converting the result back into a
          time series of the same form as `tseries'
          (although of length `n.sim').  The second argument
          to `ran.gen' is always the value `n.sim', and the
          third argument is `ran.args', which is used to
          supply any other objects needed by `ran.gen'.  If
          `sim' is `"model"' then the generation of the
          replicate time series will be done in `ran.gen'
          (for example through use of `arima.sim').  For the
          other simulation types `ran.gen' is used for
          "post-blackening".  The default is that the
          function simply returns the time series passed to
          it.

ran.args: This will be supplied to `ran.gen' each time it is
          called.  If `ran.gen' needs any extra arguments
          then they should be supplied as components of
          `ran.args'.  Multiple arguments may be passed by
          making `ran.args' a list.  If `ran.args' is `NULL'
          then it should not be used within `ran.gen' but
          note that `ran.gen' must still have its third
          argument.

    norm: A logical argument indicating whether normal
          margins should be used for phase scrambling.  If
          `norm' is `FALSE' then margins corresponding to
          the exact empirical margins are used.

     ...: Any extra arguments to `statistic' may be supplied
          here.

_D_e_t_a_i_l_s_:

     If `sim' is `"fixed"' then each replicate time series
     is found by taking blocks of length `l' from the
     original time series and putting them end-to-end until
     a new series of length `n.sim' is created.  When `sim'
     is `"geom"' a similar approach is taken except that now
     the block lengths are generated from a geometric
     distribution with mean `l'.  Post-blackening can be
     carried out on these replicate time series by including
     the function `ran.gen' in the call to `tsboot' and
     having `tseries' as a time series of residuals.
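
     As a rough illustration of the two block schemes described
     above, the following sketch builds the index vector for one
     replicate series.  It is not the code used inside `tsboot':
     the helper `block_indices' is hypothetical, and wrapping
     blocks around the end of the series is an assumed form of
     end correction.

```r
# Illustrative sketch only; not the code used inside tsboot.
# block_indices() is a hypothetical helper returning the positions of the
# original series that make up one replicate of length n.sim.
block_indices <- function(n, n.sim, l, sim = c("fixed", "geom")) {
  sim <- match.arg(sim)
  idx <- integer(0)
  while (length(idx) < n.sim) {
    # block length: constant l, or geometric on 1, 2, ... with mean l
    len <- if (sim == "fixed") l else rgeom(1, prob = 1 / l) + 1
    # random starting position anywhere in the series
    start <- sample.int(n, 1)
    # consecutive positions, wrapping past the end (assumed end correction)
    block <- (start + seq_len(len) - 2) %% n + 1
    idx <- c(idx, block)
  }
  # trim the final block so the replicate has exactly n.sim values
  idx[seq_len(n.sim)]
}

set.seed(1)
x <- log(lynx)
i.fixed <- block_indices(length(x), n.sim = length(x), l = 20, sim = "fixed")
i.geom  <- block_indices(length(x), n.sim = length(x), l = 20, sim = "geom")
x.fixed <- ts(x[i.fixed])   # one fixed-block replicate
x.geom  <- ts(x[i.geom])    # one replicate with geometric block lengths
```

     Working with indices rather than values keeps the sketch
     usable for multivariate series, where the same positions
     would select whole rows.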

     Model based resampling is very similar to the
     parametric bootstrap and all simulation must be done in
     one of the user-specified functions.  This avoids the
     complicated problem of choosing the block length but
     relies on an accurate model choice being made.

     Phase scrambling is described in Section 8.2.4 of
     Davison and Hinkley (1997).  The range of statistics
     for which this method produces reasonable results is
     very limited and the other methods seem to do better in
     most situations.  Other types of resampling in the
     frequency domain can be accomplished using the function
     `boot' with the argument `sim="parametric"'.
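
     The idea behind phase scrambling can be sketched as
     follows: the Fourier transform of the centred series is
     given independent random phases while its moduli are kept,
     so the replicate has the same periodogram as the original.
     The function `phase_scramble' is a hypothetical helper
     written for illustration; it omits the transformation of
     the margins controlled by `norm' and is not necessarily how
     `tsboot' implements `sim="scramble"'.

```r
# Illustrative sketch only; phase_scramble() is a hypothetical helper.
phase_scramble <- function(x) {
  n <- length(x)
  stopifnot(n >= 3)
  m <- mean(x)
  z <- fft(x - m)                      # transform of the centred series
  k <- floor((n - 1) / 2)              # number of free positive frequencies
  theta <- runif(k, 0, 2 * pi)         # independent uniform phases
  phase <- rep(1 + 0i, n)              # frequency 0 (and Nyquist) unchanged
  phase[2:(k + 1)] <- exp(1i * theta)          # positive frequencies
  phase[n:(n - k + 1)] <- exp(-1i * theta)     # conjugate partners
  # conjugate symmetry makes the inverse transform real up to rounding
  m + Re(fft(z * phase, inverse = TRUE)) / n
}

set.seed(1)
x <- as.numeric(log(lynx))
y <- phase_scramble(x)

# the moduli of the transform, and hence the periodogram, are preserved
all.equal(Mod(fft(y - mean(y))), Mod(fft(x - mean(x))))
```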

_V_a_l_u_e_:

     An object of class `"boot"' with the following
     components.

      t0: If `orig.t' is `TRUE' then `t0' is the result of
          `statistic(tseries, ...)'; otherwise it is `NULL'.

       t: The results of applying `statistic' to the
          replicate time series.

       R: The value of `R' as supplied to `tsboot'.

 tseries: The original time series.

statistic: The function `statistic' as supplied.

     sim: The simulation type used in generating the
          replicates.

 endcorr: The value of `endcorr' used.  The value is
          meaningful only when `sim' is `"fixed"'; it is
          ignored for model based simulation or phase
          scrambling and is always set to `TRUE' if `sim' is
          `"geom"'.

   n.sim: The value of `n.sim' used.

       l: The value of `l' used for block based resampling.
          This will be `NULL' if block based resampling was
          not used.

 ran.gen: The `ran.gen' function used for generating the
          series or for "post-blackening".

ran.args: The extra arguments passed to `ran.gen'.

    call: The original call to `tsboot'.
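
     A short sketch of inspecting the returned object.  The
     statistic `acf.stat' and its argument `lag.max' are made up
     for this illustration; `lag.max' reaches `acf.stat' through
     the `...' argument of `tsboot'.

```r
library(boot)

# hypothetical statistic: the first lag.max sample autocorrelations
acf.stat <- function(tsb, lag.max) {
  acf(tsb, lag.max = lag.max, plot = FALSE)$acf[-1]
}

set.seed(1)
out <- tsboot(log(lynx), acf.stat, R = 19, l = 20, sim = "fixed",
              lag.max = 3)

class(out)     # the object has class "boot"
out$t0         # statistic applied to the original series
dim(out$t)     # one row per replicate, one column per element of t0
out$sim        # simulation type used
out$l          # block length used
```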

_R_e_f_e_r_e_n_c_e_s_:

     Davison, A.C. and Hinkley, D.V. (1997) Bootstrap
     Methods and Their Application. Cambridge University
     Press.

     Künsch, H.R. (1989) The jackknife and the bootstrap for
     general stationary observations. Annals of Statistics,
     17, 1217-1241.

     Politis, D.N. and Romano, J.P. (1994) The stationary
     bootstrap. Journal of the American Statistical
     Association, 89, 1303-1313.

_S_e_e _A_l_s_o_:

     `boot', `arima.sim'

_E_x_a_m_p_l_e_s_:

     library(boot)  # tsboot; `ar' and `arima.sim' are in the stats package
     data(lynx)
     lynx.fun <- function(tsb)
     {    ar.fit <- ar(tsb, order.max=25)
          c(ar.fit$order, mean(tsb), tsb)
     }

     # the stationary bootstrap with mean block length 20
     lynx.1 <- tsboot(log(lynx), lynx.fun, R=99, l=20, sim="geom")

     # the fixed block bootstrap with length 20
     lynx.2 <- tsboot(log(lynx), lynx.fun, R=99, l=20, sim="fixed")

     # Now for model based resampling we need the original model.
     # Note that for all of the bootstraps which use the residuals as their
     # data, we set orig.t to FALSE since the function applied to the
     # residual time series will be meaningless.
     lynx.ar <- ar(log(lynx))
     lynx.model <- list(order=c(lynx.ar$order,0,0),ar=lynx.ar$ar)
     lynx.res <- lynx.ar$resid[!is.na(lynx.ar$resid)]
     lynx.res <- lynx.res - mean(lynx.res)

     lynx.sim <- function(res,n.sim, ran.args) {
     # random generation of replicate series using arima.sim
          rg1 <- function(n, res)
               sample(res, n, replace=TRUE)
          ts.orig <- ran.args$ts
          ts.mod <- ran.args$model
          mean(ts.orig)+ts(arima.sim(model=ts.mod, n=n.sim,
                           rand.gen=rg1, res=as.vector(res)))
     }

     lynx.3 <- tsboot(lynx.res, lynx.fun, R=99, sim="model", n.sim=114,
                      orig.t=FALSE, ran.gen=lynx.sim,
                      ran.args=list(ts=log(lynx), model=lynx.model))

     #  For "post-blackening" we need to define another function
     lynx.black <- function(res, n.sim, ran.args)
     {    ts.orig <- ran.args$ts
          ts.mod <- ran.args$model
          mean(ts.orig) + ts(arima.sim(model=ts.mod,n=n.sim,innov=res))
     }

     # Now we can apply the two types of block resampling again, but this
     # time with post-blackening.
     lynx.1b <- tsboot(lynx.res, lynx.fun, R=99, l=20, sim="fixed",
                       n.sim=114, orig.t=FALSE, ran.gen=lynx.black,
                       ran.args=list(ts=log(lynx), model=lynx.model))

     lynx.2b <- tsboot(lynx.res, lynx.fun, R=99, l=20, sim="geom",
                       n.sim=114, orig.t=FALSE, ran.gen=lynx.black,
                       ran.args=list(ts=log(lynx), model=lynx.model))

     # To compare the observed order of the bootstrap replicates we
     # proceed as follows.
     table(lynx.1$t[,1])
     table(lynx.1b$t[,1])
     table(lynx.2$t[,1])
     table(lynx.2b$t[,1])
     table(lynx.3$t[,1])
     # Notice that the post-blackened and model-based bootstraps preserve
     # the true order of the model (11) in many more cases than the others.

