Introduction to R Statistics Commands (3): t-test

獨斷論

부르칸 2013. 7. 1. 16:17

Now let's see how to run a t-test in R.

Simple t-test

Let's start with the simplest case, a one-sample t-test.

Enter the data as follows.

> the.data = c(7, 7, 7, 5, 5, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 2, 1)
> the.data
[1] 7 7 7 5 5 4 4 4 4 4 4 4 3 3 3 3 3 2 1

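To test whether the mean of the.data differs from 5, run t.test as in the three calls that follow.

The first call tests whether the mean is different from 5 (two-sided),

the second tests whether the mean is greater than 5,

and the third tests whether the mean is less than 5.

When reading the R output, keep in mind that at the 5% significance level (which corresponds to a 95% confidence level) the null hypothesis is rejected when the p-value is less than 0.05.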

> t.test(the.data, mu=5)

   One Sample t-test



data: the.data
t = -2.557, df = 18, p-value = 0.01981
alternative hypothesis: true mean is not equal to 5
95 percent confidence interval:
3.274232 4.831031
sample estimates:
mean of x
4.052632

> t.test(the.data, mu=5, alternative="greater")

   One Sample t-test

data: the.data
t = -2.557, df = 18, p-value = 0.9901
alternative hypothesis: true mean is greater than 5
95 percent confidence interval:
3.410155      Inf
sample estimates:
mean of x
4.052632

> t.test(the.data, mu=5, alternative="less")

    One Sample t-test

data: the.data
t = -2.557, df = 18, p-value = 0.009904
alternative hypothesis: true mean is less than 5
95 percent confidence interval:
-Inf       4.695109
sample estimates:
mean of x
4.052632
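To make the output above less of a black box, here is a minimal sketch that recomputes the t statistic and the three p-values by hand and compares them with t.test. The variable names (mu0, se, t.stat, p.two.sided and so on) are chosen only for this illustration and are not part of the original post.

```r
# Same data as in the post
the.data <- c(7, 7, 7, 5, 5, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 2, 1)
mu0 <- 5                          # hypothesized mean (illustrative name)

n      <- length(the.data)
se     <- sd(the.data) / sqrt(n)  # standard error of the mean
t.stat <- (mean(the.data) - mu0) / se
df     <- n - 1                   # degrees of freedom

# p-values for the three alternatives
p.two.sided <- 2 * pt(-abs(t.stat), df)             # mean != 5
p.greater   <- pt(t.stat, df, lower.tail = FALSE)   # mean > 5
p.less      <- pt(t.stat, df)                       # mean < 5

# Compare the hand computation with t.test()
res <- t.test(the.data, mu = mu0)
print(c(manual = t.stat, t.test = unname(res$statistic)))
print(c(manual = p.two.sided, t.test = res$p.value))
```

Because the sample mean (4.05) is below 5, only the two-sided and "less" alternatives give p-values under 0.05; the "greater" alternative gives a p-value close to 1.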

Independent t-test

First enter the two samples x1 and x2 as follows.

x1 = c(18, 25, 17, 20, 23)
x2 = c(20, 30, 22, 25, 28, 30)

A t-test for two independent samples is run as follows.

> t.test(x1, x2, var.equal=T)

Two Sample t-test

data: x1 and x2
t = -2.2395, df = 9, p-value = 0.05188
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-10.51953398 0.05286732
sample estimates:
mean of x mean of y
20.60000 25.83333

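In the call above, var.equal is a logical argument that takes T (TRUE) or F (FALSE). Setting it to T assumes that the two samples have equal variances, and the pooled variance is used. Setting it to F assumes that the variances are not equal, and the degrees of freedom are computed with the Welch (Satterthwaite) approximation.

If var.equal is not specified, it defaults to F.

See the following.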

> t.test(x1, x2)

Welch Two Sample t-test

data: x1 and x2
t = -2.2903, df = 8.995, p-value = 0.04776
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-10.40273450 -0.06393217
sample estimates:
mean of x mean of y
20.60000 25.83333
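The difference between the two settings of var.equal can be sketched by computing both versions by hand. This is only an illustration under the standard textbook formulas for the pooled variance and the Welch-Satterthwaite degrees of freedom; the variable names (sp2, se2, df.welch and so on) are made up for this sketch.

```r
x1 <- c(18, 25, 17, 20, 23)
x2 <- c(20, 30, 22, 25, 28, 30)
n1 <- length(x1); n2 <- length(x2)
v1 <- var(x1);    v2 <- var(x2)

# var.equal = TRUE: pooled variance, df = n1 + n2 - 2
sp2       <- ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t.pooled  <- (mean(x1) - mean(x2)) / sqrt(sp2 * (1 / n1 + 1 / n2))
df.pooled <- n1 + n2 - 2

# var.equal = FALSE: separate variances, Welch-Satterthwaite df
se2      <- v1 / n1 + v2 / n2
t.welch  <- (mean(x1) - mean(x2)) / sqrt(se2)
df.welch <- se2^2 / ((v1 / n1)^2 / (n1 - 1) + (v2 / n2)^2 / (n2 - 1))

print(c(t.pooled = t.pooled, df.pooled = df.pooled))
print(c(t.welch = t.welch, df.welch = df.welch))
```

The Welch degrees of freedom are generally not an integer, which is why the output above reports df = 8.995 rather than 9.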

One-tailed t-test

Keep in mind that R always subtracts x2 from x1, that is, the tested difference is (mean of x1) - (mean of x2).

A one-tailed t-test is run as follows.

> t.test(x1, x2, var.equal=T, alternative="less")


     Two Sample t-test

data: x1 and x2
t = -2.2395, df = 9, p-value = 0.02594
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
    -Inf    -0.9497217
sample estimates:
mean of x mean of y
20.60000 25.83333
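Since the direction of the subtraction decides which alternative to pass, a small sketch may help: swapping the two samples and switching "less" to "greater" states the same hypothesis, so the p-value should be unchanged while the t statistic flips sign. The object names a and b are just placeholders for this example.

```r
x1 <- c(18, 25, 17, 20, 23)
x2 <- c(20, 30, 22, 25, 28, 30)

# H1: mean(x1) - mean(x2) < 0
a <- t.test(x1, x2, var.equal = TRUE, alternative = "less")

# Same hypothesis with the samples swapped: mean(x2) - mean(x1) > 0
b <- t.test(x2, x1, var.equal = TRUE, alternative = "greater")

print(c(a = a$p.value, b = b$p.value))                      # same p-value
print(c(a = unname(a$statistic), b = unname(b$statistic)))  # opposite signs
```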

Dependent t-test

This test examines how the same group changed after some event or treatment.

Enter the data as follows and run the t-test.

> before = c(18, 17, 20, 19, 22, 24)
> after = c(20, 22, 19, 21, 25, 25)
> t.test(before, after, paired=T)

Paired t-test

data: before and after
t = -2.4495, df = 5, p-value = 0.05797
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-4.09887128 0.09887128
sample estimates:
mean of the differences
-2
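One way to see what paired=T does is to note that a paired t-test is a one-sample t-test on the per-subject differences, tested against 0. The sketch below checks this on the same data; the names d, paired and one.sample are illustrative only.

```r
before <- c(18, 17, 20, 19, 22, 24)
after  <- c(20, 22, 19, 21, 25, 25)

d <- before - after                 # per-subject differences

paired     <- t.test(before, after, paired = TRUE)
one.sample <- t.test(d, mu = 0)     # one-sample test on the differences

print(c(paired = unname(paired$statistic),
        one.sample = unname(one.sample$statistic)))
print(c(paired = paired$p.value, one.sample = one.sample$p.value))
```

Both calls should report the same t statistic, degrees of freedom, and p-value, with the mean difference equal to -2 as in the output above.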