data                  package:base                  R Documentation

_D_a_t_a _S_e_t_s

_D_e_s_c_r_i_p_t_i_o_n:

     `data' loads a data set or lists (via `show.data') the available
     data sets.

_U_s_a_g_e:

     data(..., list = character(0), package = .packages(),
          lib.loc = .lib.loc)
     show.data(package = .packages(), lib.loc = .lib.loc)

_A_r_g_u_m_e_n_t_s:

     ...: a sequence of names or character strings giving the data sets
          to load.

    list: a character vector of data set names to load.

 package: a name or character vector giving the packages to look into
          for data sets.  By default, all packages in the search path
          are used, then the `data' directory (if present) of the
          current working directory.

 lib.loc: a character vector of directory names of R libraries. 
          Defaults to all libraries currently known.  If the default is
          used, the loaded packages are searched before the libraries.

_D_e_t_a_i_l_s:

     Currently, four formats of data files are supported:

        1.  files ending `.RData' or `.rda' are `load()'ed.

        2.  files ending `.R' or `.r' are `source()'d in, with the R
           working directory changed temporarily to the directory
           containing the respective file.

        3.  files ending `.tab' or `.txt' are read using
           `read.table(..., header = TRUE)', and hence result in a data
           frame.

        4.  files ending `.csv' are read using `read.table(..., header
           = TRUE, sep = ";")', and also result in a data frame.
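
     The four rules above amount to a dispatch on the file extension.
     The following is a minimal sketch of that idea, not the code used
     by `data' itself.  The helper `load_data_file' is hypothetical,
     and naming the resulting data frame after the file (without its
     extension) is an assumption made for the illustration.

```r
# Hypothetical helper (not part of R): applies the four rules above to one
# file and creates the resulting object(s) in `envir'.
load_data_file <- function(path, envir = parent.frame()) {
  ext  <- sub(".*\\.", "", basename(path))       # text after the last dot
  name <- sub("\\.[^.]*$", "", basename(path))   # file name without extension
  switch(ext,
    RData = ,
    rda   = load(path, envir = envir),
    R     = ,
    r     = sys.source(path, envir = envir, chdir = TRUE),
    tab   = ,
    txt   = assign(name, read.table(path, header = TRUE), envir = envir),
    csv   = assign(name, read.table(path, header = TRUE, sep = ";"),
                   envir = envir),
    stop("unsupported data file extension: ", ext))
  invisible(name)
}

# Try it on a semicolon-separated `.csv' file written to a scratch directory.
csv_file <- file.path(tempdir(), "heights.csv")
writeLines(c("name;height", "a;1.5", "b;1.8"), csv_file)

env <- new.env()
load_data_file(csv_file, envir = env)
str(env$heights)
```

     Note that the `.csv' rule expects semicolons, so a comma-separated
     file would be read as a single column.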

     The data sets to be loaded can be specified as a sequence of names
     or character strings, or as the character vector `list', or as
     both. If no data sets are specified or `show.data' is called
     directly, the available data sets are displayed.
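
     As an illustration of the three ways of naming data sets, the
     calls below request the same two data sets.  This assumes that
     `USArrests' and `VADeaths' are available, as in the Examples
     section; the variable names `wanted' and `loaded' are made up for
     the sketch.

```r
# Names and character strings can be mixed in `...'.
data(USArrests, "VADeaths")

# The same request via `list', convenient when the names are computed.
wanted <- c("USArrests", "VADeaths")
loaded <- data(list = wanted)

# Both forms can be combined in one call.
data(USArrests, list = "VADeaths")

# The value of the call is the character vector of data sets specified.
print(loaded)
```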

     If no data sets are specified, `data' calls `show.data'.
     `show.data' looks for a file `00Index' in the `data' directory of
     each specified package, and uses these files to prepare a listing.
     If a package has a `data' directory but no index, a warning is
     given: such packages are incomplete.

     If `lib.loc' is not specified, the data sets are searched for
     amongst those packages already loaded, followed by the `data'
     directory (if any) of the current working directory and then
     packages in the libraries currently known.  If `lib.loc' is
     specified, packages are searched for in the specified libraries,
     even if they are already loaded from another library.

     To just look in the `data' directory of the current working
     directory, set `package = NULL'.

_V_a_l_u_e:

     `data()' returns a character vector of all data sets specified, or
     an empty character vector if none were specified.

_N_o_t_e:

     A package's data may consist of many small files.  On some file
     systems it is desirable to save space by zipping up the files in
     the `data' directory of an installed package as a zip archive
     `Rdata.zip'.  In that case you also need to provide a
     single-column file `filelist' listing the file names in that
     directory.
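
     A single-column `filelist' can be produced from R itself.  The
     sketch below works on a scratch directory standing in for an
     installed package's `data' directory; the file names `a.txt' and
     `b.csv' are invented, and creating the zip archive is left to an
     external tool.

```r
# Scratch stand-in for the `data' directory of an installed package.
demo_dir <- file.path(tempfile("pkg"), "data")
dir.create(demo_dir, recursive = TRUE)
writeLines(c("x y", "1 2"), file.path(demo_dir, "a.txt"))
writeLines(c("x;y", "3;4"), file.path(demo_dir, "b.csv"))

# List the files first, then write one name per line to `filelist',
# so that `filelist' does not list itself.
files <- list.files(demo_dir)
writeLines(files, file.path(demo_dir, "filelist"))

# The files named in `filelist' would then be zipped into `Rdata.zip'
# with an external zip tool; that step is not shown here.
readLines(file.path(demo_dir, "filelist"))
```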

     One can take advantage of the search order and the fact that a
     `.R' file will change directory.  If raw data are stored in
     `mydata.txt' then one can set up `mydata.R' to read `mydata.txt'
     and pre-process it, e.g. using `transform'. For instance one can
     convert numeric vectors to factors with the appropriate labels.
     Thus, the `.R' file can effectively contain a metadata
     specification for the plaintext formats.
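
     A minimal sketch of this arrangement follows.  The column names
     and the coding of `sex' are invented for the illustration, and
     `source(..., chdir = TRUE)' is used here to emulate how a `.R'
     data file is read in with the working directory changed
     temporarily.

```r
# Scratch `data' directory holding the raw file and its companion script.
data_dir <- file.path(tempfile("pkg"), "data")
dir.create(data_dir, recursive = TRUE)

# Raw data: `sex' is coded numerically (1 = male, 2 = female; made-up coding).
writeLines(c("id sex weight",
             "1 1 70.5",
             "2 2 61.0",
             "3 2 58.2"),
           file.path(data_dir, "mydata.txt"))

# Contents of `mydata.R': it refers to `mydata.txt' by bare file name,
# which works because the script is sourced from within its own directory.
writeLines(c(
  'mydata <- read.table("mydata.txt", header = TRUE)',
  'mydata <- transform(mydata,',
  '                    sex = factor(sex, levels = c(1, 2),',
  '                                 labels = c("male", "female")))'),
  file.path(data_dir, "mydata.R"))

# Emulate the `.R' rule: source with a temporary change of directory.
source(file.path(data_dir, "mydata.R"), chdir = TRUE)
str(mydata)
```

     Because `mydata.R' carries the factor labels, the plain-text file
     can stay in its raw numeric coding.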

_S_e_e _A_l_s_o:

     `help' for obtaining documentation on data sets.

_E_x_a_m_p_l_e_s:

     data()                       # list all available data sets
     data(package = base)         # list the data sets in the base package
     data(USArrests, "VADeaths")  # load the data sets `USArrests' and `VADeaths'
     help(USArrests)              # give information on data set `USArrests'

