

grep {base}                                  R Documentation

_P_a_t_t_e_r_n _M_a_t_c_h_i_n_g _a_n_d _R_e_p_l_a_c_e_m_e_n_t

_D_e_s_c_r_i_p_t_i_o_n_:

     `grep' searches for matches to `pattern' (its first
     argument) within the vector `x' of character strings
     (second argument). `regexpr' does too, but returns more
     detail in a different format.

     `sub' and `gsub' perform replacement of matches deter-
     mined by regular expression matching.

_U_s_a_g_e_:

     grep(pattern, x, ignore.case=FALSE, extended=TRUE, value=FALSE)
     sub(pattern, replacement, x,
             ignore.case=FALSE, extended=TRUE)
     gsub(pattern, replacement, x,
             ignore.case=FALSE, extended=TRUE)
     regexpr(pattern, text, extended=TRUE)

_A_r_g_u_m_e_n_t_s_:

 pattern: character string containing a regular expression
          to be matched in the vector of character strings
          `x' (or `text' for `regexpr').

 x, text: a vector of character strings where matches are
          sought.

ignore.case: if `FALSE', the pattern matching is case sensi-
          tive and if `TRUE', case is ignored during match-
          ing.

extended: if `TRUE', extended regular expression matching is
          used, and if `FALSE' basic regular expressions are
          used.

   value: if `FALSE', a vector containing the (`integer')
          indices of the matches determined by `grep' is
          returned, and if `TRUE', a vector containing the
          matching elements themselves is returned.

replacement: a replacement for the matched pattern in `sub'
          and `gsub'.

_D_e_t_a_i_l_s_:

     The two `*sub' functions differ only in that `sub'
     replaces only the first occurrence of `pattern' in each
     element of `x', whereas `gsub' replaces all occurrences.
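
     The difference between the two can be seen in a minimal
     sketch. The input vector and the variable names below are
     made up for illustration and are not part of the original
     page.

```r
## Illustrative input (hypothetical, not from the help page)
x <- c("banana", "cabbage")

## sub() replaces only the first "a" in each element
first_only <- sub("a", "A", x)
first_only   # "bAnana" "cAbbage"

## gsub() replaces every "a" in each element
every <- gsub("a", "A", x)
every        # "bAnAnA" "cAbbAge"
```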

     The regular expressions used are those specified by
     POSIX 1003.2, either extended or basic, depending on
     the value of the `extended' argument.

_V_a_l_u_e_:

     For `grep' a vector giving either the indices of the
     elements of `x' that yielded a match or, if `value' is
     `TRUE', the matched elements.
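
     As a small sketch of the two return forms of `grep' (the
     input vector is an assumption chosen for illustration):

```r
## Illustrative input (hypothetical, not from the help page)
x <- c("apple", "pear", "grape")

## Default: integer indices of the matching elements
idx <- grep("ap", x)
idx    # 1 3

## value = TRUE: the matching elements themselves
vals <- grep("ap", x, value = TRUE)
vals   # "apple" "grape"

## Indexing with the first form reproduces the second
identical(x[idx], vals)
```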

     For `sub' and `gsub' a character vector of the same
     length as `x'.

     For `regexpr' an integer vector of the same length as
     `text' giving the starting position of the first match,
     or -1 if there is none, with attribute `"match.length"'
     giving the length of the matched text (or -1 for no
     match).
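
     The following sketch shows one way the result of `regexpr'
     might be used to extract the matched text. The input
     vector and helper variable names are hypothetical.

```r
## Illustrative input (hypothetical, not from the help page)
x <- c("seven", "ten", "one")

m <- regexpr("en", x)

## Starting positions of the first match, -1 where there is none
start <- as.integer(m)
start   # 4 2 -1

## Lengths of the matched text, -1 where there is no match
len <- attr(m, "match.length")
len     # 2 2 -1

## Extract the matched substrings for the elements that matched
hit <- start > 0
matched <- substring(x[hit], start[hit], start[hit] + len[hit] - 1)
matched # "en" "en"
```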

_N_o_t_e_:

     It is possible to compile R without support for regular
     expressions, and then these functions are not opera-
     tional.

     On the Macintosh port these functions are based on the
     regex regular expression library written by Henry
     Spencer of the University of Toronto.

_S_e_e _A_l_s_o_:

     `charmatch', `pmatch', `match'.  `apropos' uses regexps
     and has nice examples.

_E_x_a_m_p_l_e_s_:

     grep("[a-z]", letters)

     txt <- c("arm","foot","lefroo", "bafoobar")
     if(length(i <- grep("foo",txt)) > 0)
        cat("`foo' appears at least once in\n",txt,"\n")
     i # 2 and 4
     txt[i]

     ## Double all 'a' or 'b's;  "\" must be escaped, i.e. `doubled'
     gsub("([ab])", "\\1_\\1_", "abc and ABC")

     txt <- c("The", "licenses", "for", "most", "software", "are",
       "designed", "to", "take", "away", "your", "freedom",
       "to", "share", "and", "change", "it.",
        "", "By", "contrast,", "the", "GNU", "General", "Public", "License",
        "is", "intended", "to", "guarantee", "your", "freedom", "to",
        "share", "and", "change", "free", "software", "--",
        "to", "make", "sure", "the", "software", "is",
        "free", "for", "all", "its", "users")
     ( i <- grep("[gu]", txt) ) # indices
     all( txt[i] == grep("[gu]", txt, value = TRUE) )
     (ot <- sub("[b-e]",".", txt))
     txt[ot != gsub("[b-e]",".", txt)]#- gsub does "global" substitution

     txt[gsub("g","#", txt) !=
         gsub("g","#", txt, ignore.case = TRUE)] # the "G" words

     regexpr("en", txt)

