Factorial Between-Subjects ANOVA, Part 2 (Introductory Statistics with R - 9)

부르칸 2013. 11. 26. 06:22

Continuing from Part 1 of Factorial between-subjects ANOVA, in this second part we actually run the ANOVA.

First, here is a rough overview of how an ANOVA model formula is written in R.

symbol	example	meaning
+	+ x	include this variable
-	- x	delete this variable
:	x : z	include the interaction between these variables
*	x * z	include these variables and the interactions between them
/	x / z	nesting: include z nested within x
|	x | z	conditioning: include x given z
^	(u + v + w)^3	include these variables and all interactions up to three way
poly	poly(x,3)	polynomial regression: orthogonal polynomials
Error	Error(a/b)	specify the error term
I	I(x*z)	as is: include a new variable consisting of these variables multiplied
1	- 1	intercept: delete the intercept (regress through the origin)

So, to run an ANOVA with the two variables x1 and x2 as independent variables, write

y ~ x1 + x2

and to include the interaction between the two variables as well, write

y ~ x1 * x2
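To see what these formulas expand to, one option is to ask R for the term labels of a formula. The following is a minimal sketch; the helper name `expand_terms` and the variable names are made up for illustration, and no data is needed because only the formula is inspected.

```r
# Hypothetical helper: list the terms a model formula expands to.
expand_terms <- function(f) attr(terms(f), "term.labels")

additive  <- expand_terms(y ~ x1 + x2)        # main effects only
crossed   <- expand_terms(y ~ x1 * x2)        # main effects plus x1:x2
three_way <- expand_terms(y ~ (u + v + w)^3)  # all terms up to u:v:w
nested    <- expand_terms(y ~ x / z)          # x, then z within x (x:z)

# The "intercept" attribute is 0 when the intercept has been removed.
no_intercept <- attr(terms(y ~ x - 1), "intercept")

print(additive)
print(crossed)
print(three_way)
print(nested)
print(no_intercept)
```

In other words, `x1 * x2` is shorthand for `x1 + x2 + x1:x2`.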

If you have cleared the data and code from Part 1 of Factorial between-subjects ANOVA, run the following; otherwise you can skip the three lines below.

> data(ToothGrowth)
> ToothGrowth$dose = factor(ToothGrowth$dose, levels=c(0.5, 1.0, 2.0), labels=c("low", "med", "high"))
> str(ToothGrowth)
'data.frame':	60 obs. of  3 variables:
 $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
 $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
 $ dose: Factor w/ 3 levels "low","med","high": 1 1 1 1 1 1 1 1 1 1 ...
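Before fitting the model it can help to look at the cell sizes and cell means of the 2 x 3 design. This is a minimal sketch rather than part of the original walkthrough; the names `cell_n` and `cell_means` are chosen here for illustration.

```r
# Rebuild the data exactly as above so the block runs on its own.
data(ToothGrowth)
ToothGrowth$dose <- factor(ToothGrowth$dose, levels = c(0.5, 1.0, 2.0),
                           labels = c("low", "med", "high"))

# Number of observations in each supp x dose cell.
cell_n <- with(ToothGrowth, table(supp, dose))

# Mean tooth length in each supp x dose cell.
cell_means <- with(ToothGrowth,
                   tapply(len, list(supp = supp, dose = dose), mean))

print(cell_n)
print(cell_means)
```

Every cell holds the same number of observations, so the design is balanced.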

The R script for running the ANOVA is as follows.

> aov.out = aov(len ~ supp * dose, data = ToothGrowth)
> summary(aov.out)
            Df Sum Sq Mean Sq F value   Pr(>F)
supp         1  205.3   205.3  15.572 0.000231 ***
dose         2 2426.4  1213.2  92.000  < 2e-16 ***
supp:dose    2  108.3    54.2   4.107 0.021860 *
Residuals   54  712.1    13.2
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

From these results, both main effects, supp and dose, are significant (p < 0.05), and the interaction effect is also significant (p = 0.022 < 0.05).

To find out which group differences actually made the ANOVA result significant, we need to run a post hoc test.
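Each F value in the table is the effect's mean square divided by the residual mean square (for supp, roughly 205.3 / 13.2), and the p-value is the upper tail of the F distribution with the matching degrees of freedom. The sketch below recomputes both from the fitted model; the names `tab`, `F_manual` and `p_manual` are illustrative, and it assumes the residual row is the last row of the summary table, as in the output above.

```r
# Rebuild the data and the model so the block runs on its own.
data(ToothGrowth)
ToothGrowth$dose <- factor(ToothGrowth$dose, levels = c(0.5, 1.0, 2.0),
                           labels = c("low", "med", "high"))
aov.out <- aov(len ~ supp * dose, data = ToothGrowth)

# summary() on an aov fit returns a list; its first element is the table.
tab <- summary(aov.out)[[1]]
ms  <- tab[["Mean Sq"]]
df  <- tab[["Df"]]

# Assumption: the last row is Residuals, all earlier rows are effects.
k <- length(ms) - 1

# F = MS_effect / MS_residual for each effect.
F_manual <- ms[1:k] / ms[k + 1]

# p = P(F > observed) with (df_effect, df_residual) degrees of freedom.
p_manual <- pf(F_manual, df[1:k], df[k + 1], lower.tail = FALSE)

print(round(F_manual, 3))
print(signif(p_manual, 3))
```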

> TukeyHSD(aov.out)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = len ~ supp * dose, data = ToothGrowth)

$supp
      diff       lwr       upr     p adj
VC-OJ -3.7 -5.579828 -1.820172 0.0002312

$dose
           diff       lwr       upr   p adj
med-low   9.130  6.362488 11.897512 0.0e+00
high-low 15.495 12.727488 18.262512 0.0e+00
high-med  6.365  3.597488  9.132512 2.7e-06

$`supp:dose`
                 diff        lwr        upr     p adj
VC:low-OJ:low   -5.25 -10.048124 -0.4518762 0.0242521
OJ:med-OJ:low    9.47   4.671876 14.2681238 0.0000046
VC:med-OJ:low    3.54  -1.258124  8.3381238 0.2640208
OJ:high-OJ:low  12.83   8.031876 17.6281238 0.0000000
VC:high-OJ:low  12.91   8.111876 17.7081238 0.0000000
OJ:med-VC:low   14.72   9.921876 19.5181238 0.0000000
VC:med-VC:low    8.79   3.991876 13.5881238 0.0000210
OJ:high-VC:low  18.08  13.281876 22.8781238 0.0000000
VC:high-VC:low  18.16  13.361876 22.9581238 0.0000000
VC:med-OJ:med   -5.93 -10.728124 -1.1318762 0.0073930
OJ:high-OJ:med   3.36  -1.438124  8.1581238 0.3187361
VC:high-OJ:med   3.44  -1.358124  8.2381238 0.2936430
OJ:high-VC:med   9.29   4.491876 14.0881238 0.0000069
VC:high-VC:med   9.37   4.571876 14.1681238 0.0000058
VC:high-OJ:high  0.08  -4.718124  4.8781238 1.0000000
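
With 15 pairwise comparisons in the interaction table, it can be easier to pull out only the rows of interest. The following is a minimal sketch that restricts TukeyHSD to the interaction term through its `which` argument and lists the pairs whose adjusted p-value is not below 0.05; the names `cmp` and `not_sig` and the 0.05 cutoff are choices made here for illustration.

```r
# Rebuild the data and the model so the block runs on its own.
data(ToothGrowth)
ToothGrowth$dose <- factor(ToothGrowth$dose, levels = c(0.5, 1.0, 2.0),
                           labels = c("low", "med", "high"))
aov.out <- aov(len ~ supp * dose, data = ToothGrowth)

# Only compute the comparisons for the supp:dose interaction.
tk  <- TukeyHSD(aov.out, which = "supp:dose")

# Each component is a matrix with columns diff, lwr, upr and "p adj".
cmp <- tk[["supp:dose"]]

# Pairs that do NOT differ significantly at the 0.05 level.
not_sig <- rownames(cmp)[cmp[, "p adj"] >= 0.05]

print(round(cmp[not_sig, , drop = FALSE], 4))
```

In the output above these are the four pairs VC:med-OJ:low, OJ:high-OJ:med, VC:high-OJ:med and VC:high-OJ:high; the last one shows that the two supplements are practically indistinguishable at the high dose, whereas they differ at the low and medium doses.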