

anscombe {base}                              R Documentation

_A_n_s_c_o_m_b_e_'_s _Q_u_a_r_t_e_t _o_f _`_`_I_d_e_n_t_i_c_a_l_'_' _S_i_m_p_l_e _L_i_n_e_a_r _R_e_g_r_e_s_-
_s_i_o_n_s

_D_e_s_c_r_i_p_t_i_o_n_:

     Four x-y datasets which have the same traditional statistical
     properties (mean, variance, correlation, regression line, etc.),
     yet look quite different when plotted.
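
     The agreement described above can be checked numerically. The
     following is a minimal sketch, not part of the original examples;
     the object name `stats` and the column labels `set1` ... `set4`
     are introduced here purely for illustration.

```r
data(anscombe)

## One column of summary statistics per x-y pair.
stats <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  fit <- lm(y ~ x)                      # simple linear regression of y on x
  c(mean.x = mean(x), var.x = var(x),
    mean.y = mean(y), var.y = var(y),
    cor = cor(x, y), coef(fit))         # coef(fit) adds "(Intercept)" and "x"
})
colnames(stats) <- paste0("set", 1:4)   # illustrative labels only

round(stats, 2)
```

     Each row of the resulting matrix should be nearly constant across
     the four columns, which is the sense in which the regressions are
     ``identical''.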

_U_s_a_g_e_:

     data(anscombe)

_F_o_r_m_a_t_:

     A data frame with 11 observations on 8 variables.

      x1 == x2 == x3     the integers 4:14, specially arranged
                  x4     the value 8 (ten times) and 19 (once)
      y1, y2, y3, y4     numbers in (3, 13)
                         mean(y<j>) = 7.5 and st.dev = 2.03
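
     The entries of this table can be verified directly from the data
     frame. This is an illustrative sketch only; the name `ys` is
     introduced here and does not appear elsewhere on this page.

```r
data(anscombe)

## x1, x2 and x3 are the same column
stopifnot(identical(anscombe$x1, anscombe$x2),
          identical(anscombe$x1, anscombe$x3))

sort(anscombe$x1)        # the distinct values, in increasing order
table(anscombe$x4)       # how often each value of x4 occurs

ys <- anscombe[paste0("y", 1:4)]   # the four response columns
round(sapply(ys, mean), 2)         # column means
round(sapply(ys, sd), 2)           # column standard deviations
range(unlist(ys))                  # smallest and largest y value overall
```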

_S_o_u_r_c_e_:

     Edward R. Tufte (1989).  The Visual Display of Quantitative
     Information; Graphics Press, pp. 13-14.

_R_e_f_e_r_e_n_c_e_s_:

     Francis J. Anscombe (1973). Graphs in Statistical Analysis;
     American Statistician, 27, 17-21.

_E_x_a_m_p_l_e_s_:

     data(anscombe)
     summary(anscombe)

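     ## Now some "magic" to do the 4 regressions in a loop:
     ff <- y ~ x   # template formula; both sides are replaced below
     for(i in 1:4) {
       ## replace response and predictor by the names y<i> and x<i>
       ff[2:3] <- lapply(paste(c("y","x"), i, sep=""), as.name)
       ## or   ff[[2]] <- as.name(paste("y", i, sep=""))
       ##      ff[[3]] <- as.name(paste("x", i, sep=""))
       ## store each fit in the workspace as lm.1, ..., lm.4
       assign(paste("lm.", i, sep=""), lmi <- lm(ff, data = anscombe))
       print(anova(lmi))
     }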

     ## See how close they are (numerically!)
     ## (anchored pattern with an escaped dot, so only lm.1 ... lm.4 match)
     sapply(objects(pattern="^lm\\.[1-4]$"), function(n) coef(get(n)))
     lapply(objects(pattern="^lm\\.[1-4]$"), function(n) coef(summary(get(n))))

     ## Now, do what you should have done in the first place: PLOTS
     op <- par(mfrow=c(2,2), mar=.1+c(4,4,1,1), oma= c(0,0,2,0))
     for(i in 1:4) {
       ff[2:3] <- lapply(paste(c("y","x"), i, sep=""), as.name)
       plot(ff, data = anscombe, col="red", pch=21, bg="orange", cex=1.2,
            xlim=c(3,19), ylim=c(3,13))
       abline(get(paste("lm.",i,sep="")), col="blue")
     }
     mtext("Anscombe's  4  Regression data sets", outer = TRUE, cex=1.5)
     par(op)
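
     The assign()/get() bookkeeping used above is one way to keep track
     of the four fits. A possible alternative, sketched here under the
     assumption that a named list is acceptable, builds each formula
     with reformulate() and stores the models in a list. The name
     `fits` is introduced for illustration only.

```r
data(anscombe)

## Fit y<i> ~ x<i> for i = 1..4 and collect the models in a list.
fits <- lapply(1:4, function(i) {
  ff <- reformulate(paste0("x", i), response = paste0("y", i))
  lm(ff, data = anscombe)
})
names(fits) <- paste0("lm.", 1:4)   # same labels as the objects created above

## Coefficients side by side; row names are taken from the first fit.
sapply(fits, coef)
```

     With the fits in a list, no pattern matching on workspace object
     names is needed to retrieve them.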

