Paired t-test in R with Examples - Statistics Tutorial (2024)

Table of Contents

1 What is a paired t-test?

2 Conditions required to conduct paired t-test

2.1 Function in R for Paired t-test

2.2 Summary for the paired t-test for mean

3 How to do paired t-test in R?

4 Examples of Paired t-test in R

4.1 Example 1: Right-tailed paired t-test in R

4.2 Example 2: Left-tailed paired t-test in R

5 Paired t-test FAQ

6 Summary

In this article, we will discuss how to do a paired t-test in R with some practical examples.

What is a paired t-test?

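A paired t-test is used when we have two related samples, for example measurements taken on the same subjects before and after a treatment. It checks whether there is a significant difference between two population means when the data come in the form of matched pairs, by testing whether the mean of the paired differences differs from zero.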

Conditions required to conduct paired t-test

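Assumptions for the paired t-test are as follows:

  • The differences between the paired observations should be approximately normally distributed.
  • The two samples are dependent: each observation in one sample is matched with exactly one observation in the other. The pairs themselves should be independent of each other.
  • The sample size should be equal for both the samples, i.e. n1 = n2.
  • The dependent variable should be continuous.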
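
The normality assumption can be checked on the differences themselves. The following is a minimal sketch using shapiro.test() from the stats package; the vectors x_before and x_after are made-up values for illustration only and are not part of the examples in this article.

```r
# Hypothetical paired measurements (illustrative values only)
x_before <- c(12.1, 11.4, 13.0, 12.7, 10.9, 11.8, 12.5, 13.3)
x_after  <- c(12.9, 11.9, 13.4, 13.6, 11.0, 12.6, 12.8, 14.1)

# Equal lengths are required, since every observation needs a partner
stopifnot(length(x_before) == length(x_after))

# The paired t-test works on the per-pair differences
d <- x_after - x_before

# Shapiro-Wilk test: a small p-value suggests the differences
# are not normally distributed
sw <- shapiro.test(d)
print(sw)
```

With very small samples this test has little power, so a normal Q-Q plot of d is a reasonable companion check.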

Hypothesis for the paired t-test

Let μd denote the mean difference.

Null Hypothesis:

H0: μd = 0 There is no difference between the two means.

Alternative Hypothesis: Three forms of alternative hypothesis are as follows:

  • Ha: μd < 0 The mean difference is less than zero. It is a lower-tail test (left-tailed test).
  • Ha: μd > 0 The mean difference is greater than zero. It is an upper-tail test (right-tailed test).
  • Ha: μd ≠ 0 The mean difference is not equal to zero. It is called a two-tailed test.

The formula for the test statistic of the paired t-test is:

t = d̅ / (sd / √n)

where:

d̅: mean of the differences between the paired observations.

n: sample size (number of pairs).

sd: standard deviation of the differences.

Under the null hypothesis, t follows a t-distribution with n − 1 degrees of freedom.
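
The formula can be evaluated by hand in R and compared with t.test(). This is only a sketch: the vectors first and second are made-up values, and the variable names are illustrative.

```r
# Hypothetical paired scores (illustrative values only)
first  <- c(20, 22, 19, 24, 21, 23)
second <- c(23, 24, 20, 27, 22, 27)

d     <- second - first   # per-pair differences
n     <- length(d)        # number of pairs
d_bar <- mean(d)          # mean of the differences
s_d   <- sd(d)            # sample standard deviation of the differences

# Test statistic and its degrees of freedom
t_stat <- d_bar / (s_d / sqrt(n))
df     <- n - 1

# Cross-check: t.test() with paired = TRUE uses the differences x - y
res <- t.test(second, first, paired = TRUE)
c(manual = t_stat, t.test = unname(res$statistic))
```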

Function in R for Paired t-test

To perform a paired t-test for the mean we will use the t.test() function from the stats package, which is loaded by default in R.

The t.test() function uses the following basic syntax:

t.test(x, y = NULL, alternative = c("two.sided", "less", "greater"), mu = 0, paired = FALSE, var.equal = FALSE, conf.level = 0.95, ...)

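where:

x, y: numeric vectors holding the two samples. With paired = TRUE, the test is carried out on the differences x − y.

alternative: The alternative hypothesis for the test: "two.sided" (the default), "less" or "greater".

mu: The hypothesized value of the mean difference under the null hypothesis (0 by default).

paired: Specifies whether it is a paired t-test or not. Here we will write TRUE.

var.equal: a logical value that indicates whether to treat the two variances as being equal. It has no effect on a paired test.

conf.level: confidence level of the interval.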
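
Because the paired test is carried out on x − y, the order of the arguments decides what "less" and "greater" mean. The sketch below uses made-up vectors x and y to show the three alternatives and the equivalent one-sample test on the differences; all names are illustrative.

```r
# Hypothetical paired samples (illustrative values only)
x <- c(5.1, 4.8, 6.0, 5.5, 5.9, 6.2, 5.0)
y <- c(4.6, 4.9, 5.2, 5.1, 5.3, 5.8, 4.7)

# `alternative` describes the sign of the mean of x - y
two_sided <- t.test(x, y, paired = TRUE, alternative = "two.sided")
right     <- t.test(x, y, paired = TRUE, alternative = "greater")  # Ha: mean(x - y) > 0
left      <- t.test(x, y, paired = TRUE, alternative = "less")     # Ha: mean(x - y) < 0

# The paired test gives the same statistic as a one-sample
# t-test on the differences
one_sample <- t.test(x - y, mu = 0, alternative = "greater")

c(right = right$p.value, left = left$p.value, two_sided = two_sided$p.value)
```

Swapping x and y flips the sign of the statistic, so a right-tailed hypothesis about y − x needs y passed first.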

Summary for the paired t-test for mean

Item | Left-tailed Test | Right-tailed Test | Two-tailed Test
Null Hypothesis | H0: μd ≥ 0 | H0: μd ≤ 0 | H0: μd = 0
Alternate Hypothesis | Ha: μd < 0 | Ha: μd > 0 | Ha: μd ≠ 0
Test Statistic | t = d̅ / (sd / √n) | t = d̅ / (sd / √n) | t = d̅ / (sd / √n)
Decision Rule: p-value approach (where α is the level of significance) | If p-value ≤ α then reject H0 | If p-value ≤ α then reject H0 | If p-value ≤ α then reject H0
Decision Rule: critical-value approach (with n − 1 degrees of freedom) | If t ≤ -tα then reject H0 | If t ≥ tα then reject H0 | If t ≤ -tα/2 or t ≥ tα/2 then reject H0
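
The critical values and p-values in the table can be obtained from the t-distribution functions qt() and pt(). A small sketch, assuming a hypothetical sample of 10 pairs and a hypothetical observed statistic t_obs:

```r
alpha <- 0.05   # level of significance
n     <- 10     # hypothetical number of pairs
df    <- n - 1  # degrees of freedom

# Critical values
t_left  <- qt(alpha, df)          # left-tailed:  reject H0 if t <= t_left  (this is -t_alpha)
t_right <- qt(1 - alpha, df)      # right-tailed: reject H0 if t >= t_right (this is  t_alpha)
t_two   <- qt(1 - alpha / 2, df)  # two-tailed:   reject H0 if |t| >= t_two (this is t_alpha/2)

# p-values for a hypothetical observed test statistic
t_obs   <- 2.1
p_left  <- pt(t_obs, df)
p_right <- pt(t_obs, df, lower.tail = FALSE)
p_two   <- 2 * pt(abs(t_obs), df, lower.tail = FALSE)

round(c(t_left = t_left, t_right = t_right, t_two = t_two), 4)
```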

How to do paired t-test in R?

We will calculate the test statistic by using a paired t-test.

Procedure to perform a paired t-test:

Step 1: Define the Null Hypothesis and Alternate Hypothesis.

Step 2: Decide the level of significance α (alpha).

Step 3: Calculate the test statistic using the t.test() function from R.

Step 4: Interpret the paired t-test results.

Step 5: Apply the rejection criterion for the chosen level of significance (compare the p-value with α, or the test statistic with the critical value) and conclude whether the test statistic lies in the rejection region or the non-rejection region.
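
The steps can be sketched as a small helper function. Note that paired_decision() is a hypothetical wrapper written for illustration, not a function that ships with R, and the input vectors are made-up values.

```r
# Hypothetical helper: runs a paired t-test on x - y and reports the decision
paired_decision <- function(x, y, alternative = "two.sided", alpha = 0.05) {
  stopifnot(length(x) == length(y))   # every observation needs a partner
  res <- t.test(x, y, paired = TRUE, alternative = alternative,
                conf.level = 1 - alpha)
  list(
    statistic = unname(res$statistic),  # Step 3: test statistic
    df        = unname(res$parameter),  # degrees of freedom (n - 1)
    p_value   = res$p.value,            # Step 4: p-value to interpret
    reject_h0 = res$p.value <= alpha    # Step 5: decision at level alpha
  )
}

# Illustrative call with made-up data, testing Ha: mean(x - y) > 0
out <- paired_decision(c(8, 9, 7, 10, 9, 11), c(6, 8, 7, 7, 8, 9),
                       alternative = "greater")
str(out)
```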

Let’s see practical examples that show how to use the t.test() function in R.

Examples of Paired t-test in R

Example 1: Right-tailed paired t-test in R

A training program was conducted to improve participants’ knowledge of the R language. Test results were collected from a selected sample both before and after the R training program. Test the hypothesis that the training is effective in improving participants’ knowledge of the R language at a 5% level of significance.

Solution: Given data

before data : 39,43,41,32,37,40,42,40,37,38
after data : 42,45,42,43,40,44,40,43,41,40

Let’s solve this example by the step-by-step procedure.

Step 1: Define the Null Hypothesis and Alternate Hypothesis.

Let μ1 be the population mean for the data before the training.

Let μ2 be the population mean for the data after the training.

μd = μ2 – μ1

Null Hypothesis: Both population means are equal.

H0: μd = 0 i.e. μ1 = μ2

Alternate Hypothesis: Population mean after the training is greater than the population mean before the training.

Ha: μd > 0 i.e. μ2 > μ1 (right-tailed test)

Step 2: Level of significance (α) = 0.05

Step 3: Calculate the test statistic using the t.test() function in R with the code below.

# Define the datasets
before <- c(39, 43, 41, 32, 37, 40, 42, 40, 37, 38)
after <- c(42, 45, 42, 43, 40, 44, 40, 43, 41, 40)

# Perform the paired t-test on the differences after - before,
# which matches μd = μ2 - μ1 from Step 1
t.test(x = after, y = before, paired = TRUE, alternative = "greater")

With paired = TRUE, t.test() analyses the differences x − y, so we pass after as x and before as y to match μd = μ2 – μ1. Specify the alternative hypothesis as “greater” because we are performing a right-tailed test. The results are as follows.

#Results
Paired t-test
data: after and before
t = 2.9876, df = 9, p-value = 0.00763
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval: 1.197915 Inf
sample estimates: mean of the differences 3.1

Step 4: Interpret the paired t-test results.

How to interpret the paired t-test results in R?

Let’s see the interpretation of the paired t-test results in R.

data: This gives information about the vectors used in the paired t-test. x represents the data set after the training and y represents the data set before the training.

t: It is the test statistic of the t-test. In our case, the test statistic = 2.9876.

df: It is the degrees of freedom for the t-test statistic. In our case, df = 9.

p-value: This is the p-value corresponding to the t-test statistic, i.e. 2.9876, and the degrees of freedom, i.e. 9. In our case, the p-value is 0.00763.

alternative: It is the alternative hypothesis used for the t-test. In our case, the alternative hypothesis is that the population mean after the training is greater than the population mean before the training, i.e. right-tailed.

95 percent confidence interval: This gives us a one-sided 95% confidence interval for the true mean difference. Here the 95% confidence interval is [1.197915, ∞).

sample estimates: It gives the mean of the differences. In our case, the sample mean of the differences is 3.1.

Step 5: Apply the rejection criterion for the chosen level of significance and conclude whether the test statistic lies in the rejection region or the non-rejection region.

Conclusion:

Since the p-value [0.00763] is less than the level of significance (α) = 0.05, we reject the null hypothesis.

This means we have sufficient evidence to say that the training is effective in improving participants’ knowledge of the R language.
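
The quantities discussed in the interpretation can also be read directly from the object returned by t.test(). A minimal sketch using this example's data, assuming the differences are taken as after − before so that they match μd = μ2 – μ1; the names res, alpha and decision are illustrative.

```r
before <- c(39, 43, 41, 32, 37, 40, 42, 40, 37, 38)
after  <- c(42, 45, 42, 43, 40, 44, 40, 43, 41, 40)

# Right-tailed paired test on the differences after - before
res <- t.test(after, before, paired = TRUE, alternative = "greater")

res$statistic  # test statistic t
res$parameter  # degrees of freedom
res$p.value    # p-value
res$conf.int   # one-sided confidence interval for the mean difference
res$estimate   # sample mean of the differences

# Decision at the 5% level of significance
alpha    <- 0.05
decision <- if (res$p.value <= alpha) "reject H0" else "fail to reject H0"
decision
```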

Example 2: Left-tailed paired t-test in R

For instance, let’s say that we work at a large drug company, and we are testing a new drug A, which is meant to reduce blood sugar in people with diabetes. We find 1000 individuals with a high blood sugar level, averaging 140 mg/dL with a standard deviation of 10 mg/dL. We give them drug A for a month and then measure their blood sugar level again. We find that the mean blood sugar level has decreased to 130 mg/dL with a standard deviation of 8 mg/dL. Test whether drug A reduces the mean blood sugar level at a 5% level of significance.

Solution:

Let’s solve this example by the step-by-step procedure.

Step 1: Define the Null Hypothesis and Alternate Hypothesis.

Let μ1 be the population mean of blood sugar level before taking the drug A.

Let μ2 be the population mean of blood sugar level after taking the drug A.

μd = μ2 – μ1

Null Hypothesis: Both population means are equal.

H0: μd = 0 i.e. μ1 = μ2

Alternate Hypothesis: Population mean after taking the drug A is less than the population mean before taking the drug A.

Ha: μd < 0 i.e. μ2 < μ1 (left-tailed test)

Step 2: Level of significance (α) = 0.05

Step 3: Calculate the test statistic using the t.test() function in R with the code below.

# Fix the random seed so that the same random numbers are generated on every run
set.seed(1000)

# Create the pre-treatment dataset with 1000 values
pre_Treatment <- rnorm(1000, mean = 140, sd = 10)

# Create the post-treatment dataset with 1000 values
post_Treatment <- rnorm(1000, mean = 130, sd = 8)

# Perform the paired t-test on the differences post_Treatment - pre_Treatment,
# which matches μd = μ2 - μ1 from Step 1
t.test(post_Treatment, pre_Treatment, paired = TRUE, alternative = "less")

With paired = TRUE, t.test() analyses the differences x − y, so we pass post_Treatment as x and pre_Treatment as y to match μd = μ2 – μ1. Specify the alternative hypothesis as “less” because we are performing a left-tailed test. The results are as follows.

#Results (upper confidence bound shown to two decimals)
Paired t-test
data: post_Treatment and pre_Treatment
t = -25.432, df = 999, p-value < 2.2e-16
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval: -Inf -9.23
sample estimates: mean of the differences -9.869133

Step 4: Interpret the paired t-test results.

How to interpret the paired t-test results in R?

Let’s see the interpretation of the paired t-test results in R.

data: This gives information about the vectors used in the paired t-test. x represents the blood sugar levels after taking the drug A and y represents the blood sugar levels before taking the drug A.

t: It is the test statistic of the t-test. In our case, the test statistic = -25.432.

df: It is the degrees of freedom for the t-test statistic. In our case, df = 999.

p-value: This is the p-value corresponding to the t-test statistic, i.e. -25.432, and the degrees of freedom, i.e. 999. In our case, the p-value is smaller than 2.2e-16, i.e. practically zero.

alternative: It is the alternative hypothesis used for the t-test. In our case, the alternative hypothesis is that the population mean after taking the drug A is less than the population mean before taking the drug A, i.e. left-tailed.

95 percent confidence interval: This gives us a one-sided 95% confidence interval for the true mean difference. Here the 95% confidence interval is approximately (-∞, -9.23].

sample estimates: It gives the mean of the differences. In our case, the sample mean of the differences is -9.869133.

Step 5: Apply the rejection criterion for the chosen level of significance and conclude whether the test statistic lies in the rejection region or the non-rejection region.

Conclusion:

Since the p-value [< 2.2e-16] is less than the level of significance (α) = 0.05, we reject the null hypothesis.

This means we have sufficient evidence to say that drug A is effective in reducing the blood sugar level of the patients.
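
The critical-value approach from the summary table can be applied to this example as well. A sketch, assuming the same seed and the differences post_Treatment − pre_Treatment (matching μd = μ2 – μ1); d, t_stat, t_crit and reject_h0 are illustrative names.

```r
set.seed(1000)
pre_Treatment  <- rnorm(1000, mean = 140, sd = 10)
post_Treatment <- rnorm(1000, mean = 130, sd = 8)

# Differences matching mu_d = mu_2 - mu_1 (after minus before)
d <- post_Treatment - pre_Treatment
n <- length(d)

# Test statistic computed from the formula t = d_bar / (s_d / sqrt(n))
t_stat <- mean(d) / (sd(d) / sqrt(n))

# Left-tail critical value at alpha = 0.05 with n - 1 degrees of freedom
t_crit <- qt(0.05, df = n - 1)

# Left-tailed rule: reject H0 when t <= -t_alpha
reject_h0 <- t_stat <= t_crit
c(t_stat = t_stat, t_crit = t_crit)
```

A statistic far below the critical value lies deep in the rejection region, which agrees with the p-value approach.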

Paired t-test FAQ

Which R function do we use to perform a paired t-test?

t.test() from the R stats package, called with paired = TRUE, is used to perform a paired t-test.

Summary

I hope you found the above article on Paired t-test in R with Examples informative and educational.

Article information

Author: Arline Emard IV
