The document provides an overview of t-tests, which are statistical methods used to compare the means of two groups to determine if they are significantly different. It explains the characteristics and applications of t-distributions, including independent and dependent samples t-tests, and outlines the procedures for conducting these tests in SPSS. Additionally, it covers the interpretation of results, including the significance of p-values and Levene’s test for equality of variances.
What Is a t Test?

◻ The common-use definition of a t test is simple: it compares two means to see whether they are significantly different from each other.
◻ The two groups could be, for example, patients who received drug A and patients who received drug B, where you want to know whether blood pressure differs between these two groups.

t Distributions

◻ When the population standard deviation (σ) is unknown or the sample size (n) is small (typically n < 30), normal-distribution-based z-scores are no longer accurate.
◻ The t-distribution adjusts for this by accounting for the sample size, making it more reliable for small datasets or unknown population parameters.

Characteristics of t-Distributions

◻ Shape: The t-distribution is bell-shaped like the normal distribution, but it has heavier tails, meaning it is more spread out and allows for more extreme values.
◻ Dependence on Sample Size: The exact shape of the t-distribution depends on the degrees of freedom (df), which are related to the sample size: df = n − 1 for a single sample. The smaller the sample size (or df), the wider and flatter the distribution. As n (and therefore df) increases, the t-distribution approaches the normal distribution (roughly n > 120).

When to Use the t-Distribution

◻ Small Sample Sizes (n < 120): Use the t-distribution when your sample size is small and you are estimating probabilities or performing hypothesis tests. Example: estimating the mean height of a population with a sample of n = 25.
◻ Population Standard Deviation Unknown: If the population standard deviation (σ) is unknown, you estimate it with the sample standard deviation (s). This introduces extra variability, which the t-distribution accounts for.

The Independent Samples t Test

◻ You use this test when you want to compare the means of two independent samples on a given variable. For example, if you wanted to compare the average height of 50 randomly selected men to that of 50 randomly selected women, you would conduct an independent samples t test.
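The ideas above can be sketched in code. The document itself uses SPSS, so the following is only a minimal Python illustration with made-up heights: it computes a one-sample t statistic, estimating σ with the sample standard deviation s and using df = n − 1.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t statistic for a one-sample test: sigma is unknown,
    so it is estimated with the sample standard deviation s."""
    n = len(sample)
    s = statistics.stdev(sample)      # sample standard deviation (n - 1 divisor)
    se = s / math.sqrt(n)             # standard error of the mean
    t = (statistics.mean(sample) - mu0) / se
    df = n - 1                        # degrees of freedom for a single sample
    return t, df

# Hypothetical heights (cm) from a small sample; test against mu0 = 170
t, df = one_sample_t([172, 168, 171, 169, 175], 170)
```

The resulting t value would then be compared against a t-distribution with df = 4 rather than the normal distribution, because n is small and σ is estimated.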
◻ Because there is no overlap between these two samples, the groups are independent.
◻ To conduct an independent samples t test, you need one categorical (nominal) independent variable and one continuous (interval-scaled) dependent variable.
◻ For example, we may want to know whether the average height of people (height is the dependent, continuous variable) depends on whether the person is a man or a woman (gender is the independent, categorical variable).

Dependent Samples t Test

◻ A dependent samples t test is also used to compare two means on a single dependent variable.
◻ Unlike the independent samples test, however, a dependent samples t test is used to compare the means of a single sample measured twice or of two matched or paired samples.

Examples

◻ If a group of students took a math test in March and the same group took the same math test two months later in May, we could compare their average scores on the two test dates using a dependent samples t test.
◻ Or suppose we wanted to compare a sample of boys’ Scholastic Aptitude Test (SAT) scores with their fathers’ SAT scores. In this example, each boy in the study would be matched with his father.
◻ In both of these examples, each score is matched, or paired, with a second score. Because of this pairing, we say that the scores are dependent on each other, and a dependent samples t test is warranted.

Conceptual Issues with the Independent Samples t Test

◻ The most complicated conceptual issue in the independent samples t test involves the standard error for the test.
◻ The test itself is designed to answer the simplest of questions: Do two independent samples differ significantly in their average scores on some variable?
◻ Suppose we want to know whether a random sample of 50 men differs significantly from a random sample of 50 women in their average enjoyment of a new television show.
◻ The male sample’s average rating is 7.5, and the female sample’s average rating is 6.5.
◻ The question is whether the samples differed significantly in their average enjoyment of the show. In other words, is this difference of 1.0 between the two samples large enough to represent a real difference between the populations of men and women?
◻ “Is the difference between the means of these two samples statistically significant?”
◻ To answer this, you must know how much difference you should expect to see between two samples of this size drawn from these two populations.
◻ The critical question is this: What is the average expected difference between the means of two samples of this size (i.e., 50 each) selected randomly from these two populations? In other words, what is the standard error of the difference between the means?

The Standard Error of the Difference between Independent Sample Means

◻ The standard error of the difference measures how much difference we should expect, on average, between two sample means selected randomly from their populations.
◻ The standard error of the difference between independent sample means is a little more complex than the standard error of the mean: it involves two samples, so we need to combine their standard errors.
◻ If the two samples have unequal sizes or different standard deviations, the formula must be adjusted to account for these differences.

Formula

◻ With sample standard deviations s1 and s2 and sample sizes n1 and n2, the standard error of the difference is sqrt(s1²/n1 + s2²/n2). When the variances are treated as equal, a pooled estimate of the variance is used in place of the separate s1² and s2².

Determining the Significance of the t Value for an Independent Samples t Test
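The standard error of the difference can be sketched in a few lines of Python. This is only an illustration, not SPSS output; the enjoyment ratings below are hypothetical small samples, not the document's data.

```python
import math
import statistics

def se_of_difference(sample1, sample2):
    """Standard error of the difference between two independent
    sample means: sqrt(s1^2/n1 + s2^2/n2)."""
    s1, s2 = statistics.stdev(sample1), statistics.stdev(sample2)
    n1, n2 = len(sample1), len(sample2)
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

def independent_t(sample1, sample2):
    """t = (mean1 - mean2) / SE(difference)."""
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    return diff / se_of_difference(sample1, sample2)

# Hypothetical enjoyment ratings for a handful of men and women
men   = [8, 7, 9, 7, 8]
women = [6, 7, 6, 8, 6]
t = independent_t(men, women)
```

The t value is simply the observed difference between the means expressed in units of this standard error, which is why the standard error is the conceptual heart of the test.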
◻ Determining significance involves checking whether the difference between two group means is likely due to chance or represents a real difference. Here is how it works in simple terms:
◻ Calculate the t value: a number that tells us how much the two groups differ, relative to the variation in the data.
◻ Find the p-value: the probability of seeing such a difference (or a more extreme one) if the groups are truly the same.
◻ Compare the p-value to the significance level (α): the most common α is 0.05 (5%). If the p-value is less than 0.05, the difference is statistically significant (not likely due to chance). If the p-value is greater than 0.05, the difference is not statistically significant (it could just be random).

Paired or Dependent Samples t Tests in Depth

◻ The goal of the dependent samples t test is to determine whether an observed difference in a sample reflects a true difference in the population.
◻ Example: a factory owner compares the productivity of 30 employees before and after a 2-week vacation. Pre-vacation average: 250 widgets. Post-vacation average: 300 widgets. Observed difference: 50 widgets.
◻ Key question: Does this 50-widget difference represent a real improvement in the productivity of all employees, or is it just due to chance?
◻ Comparison with the independent samples t test: instead of comparing two different groups, a dependent samples t test compares the same group on two occasions (e.g., pre-vacation vs. post-vacation).
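The paired logic above can be sketched in Python: the test works on the difference scores (after − before) for each pair. The widget counts below are hypothetical numbers for five employees, not the 30-employee data described in the example.

```python
import math
import statistics

def paired_t(before, after):
    """Dependent samples t test: compute each pair's difference score,
    then t = mean(d) / (sd(d) / sqrt(n)), with df = n - 1 pairs."""
    d = [a - b for a, b in zip(after, before)]
    n = len(d)
    t = statistics.mean(d) / (statistics.stdev(d) / math.sqrt(n))
    return t, n - 1

# Hypothetical widget counts for five employees, before and after vacation
pre  = [240, 255, 248, 260, 247]
post = [295, 310, 290, 305, 300]
t, df = paired_t(pre, post)
```

Because each worker serves as his or her own comparison, the person-to-person variation cancels out of the difference scores, which is what makes the paired design sensitive.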
◻ Another difference between dependent and independent samples t tests can be found in the calculation of the degrees of freedom.
◻ Whereas we add the two sample sizes together and subtract 2 in the independent samples formula, for dependent samples we find the number of pairs of scores and subtract 1.

t Test Procedure in SPSS

◻ Analyze → Compare Means → Independent Samples T-Test.
◻ Make the variable for which you want to compute a mean the “Test Variable.”
◻ Make the variable that defines the two groups the “Grouping Variable.”
◻ Click “Define Groups,” then enter the value that each group has on the “Grouping Variable.”

What to Remember

◻ The t test for the difference in means is a hypothesis test of the null hypothesis that the means for both groups are equal, versus the alternative hypothesis that the means are not equal (2-tailed) or that the mean for one of the groups is larger than the mean for the other group (1-tailed).
◻ To interpret the t-test results, all you need to find on the output is the p-value for the test.
◻ To do a hypothesis test at a specific alpha (significance) level, just compare the p-value on the output (labeled as a “Sig.” value in the SPSS output) to the chosen alpha level.

Example SPSS Output for the t Test for a Difference in Means

Interpretation of Output

◻ Note the mean for each of the two groups in the “Group Statistics” section.
◻ This output shows that the average weight for European cars is 2431 pounds, versus 2221 pounds for Japanese cars.
◻ To determine whether this difference in means is statistically significant, we examine the p-value for the t test.
◻ The p-value is labeled “Sig. (2-tailed)” in the SPSS output, which stands for the significance level of the test.

How to Identify the Correct p-value:
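The two degrees-of-freedom rules and the p-value comparison above can be sketched as a few helper functions; this is only an illustrative restatement of the rules, not anything SPSS-specific.

```python
def df_independent(n1, n2):
    """Independent samples: add the two sample sizes, then subtract 2."""
    return n1 + n2 - 2

def df_dependent(n_pairs):
    """Dependent samples: the number of pairs of scores, minus 1."""
    return n_pairs - 1

def is_significant(p_value, alpha=0.05):
    """Compare the p-value (the SPSS "Sig." value) to the chosen alpha:
    reject the null hypothesis of equal means when p < alpha."""
    return p_value < alpha
```

For example, two samples of 50 give df = 98, and 30 paired scores give df = 29.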
◻ In the “Independent Samples Test” output, look under the section labeled “t-test for Equality of Means.”
◻ Focus on the column labeled “Sig. (2-tailed)”; this gives the p-value for the t test.
◻ Do not confuse this with the column labeled “Sig.” in the Levene’s Test for Equality of Variances section, which tests for equal variances but is not the p-value for the t test itself.

Levene’s Test for Equality of Variances (Choosing the Correct Row)

◻ Levene’s Test for Equality of Variances is used to check whether the variances (spread of data) of the two groups are equal.
◻ Equal variances assumed: this assumes that the variability (spread) of the data is equal for both groups (e.g., the spread of weights is similar for European and Japanese vehicles).
◻ If the Sig. (p-value) from Levene’s test is greater than 0.05, there is no significant difference in variances, so we assume the variances are equal. In such cases, use the “Equal variances assumed” row to interpret the t-test results: read from the top row.
◻ If the Sig. (p-value) from Levene’s test is 0.05 or less, there is a significant difference in the variances, so the assumption of equal variances is violated. In this case, use the “Equal variances not assumed” row, which adjusts for this difference and gives more reliable results.

Interpretation of the Results

◻ The p-value (Sig. 2-tailed) in the second row is 0.002.
◻ This indicates that the difference in mean weights between European and Japanese cars is statistically significant at the 0.1, 0.05, and 0.01 levels.
◻ This means there is strong evidence that the difference in average weights is not due to random chance.
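The two rows of the SPSS table correspond to two versions of the t statistic: the pooled-variance form (“Equal variances assumed”) and Welch’s form (“Equal variances not assumed”). A minimal Python sketch, using hypothetical car weights rather than the document’s data, shows how the two can diverge when group sizes and spreads differ:

```python
import math
import statistics

def pooled_t(x, y):
    """'Equal variances assumed' row: pooled-variance t statistic."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * statistics.variance(x) +
           (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (statistics.mean(x) - statistics.mean(y)) / se

def welch_t(x, y):
    """'Equal variances not assumed' row: Welch's t statistic,
    which keeps the two sample variances separate."""
    se = math.sqrt(statistics.variance(x) / len(x) +
                   statistics.variance(y) / len(y))
    return (statistics.mean(x) - statistics.mean(y)) / se

# Hypothetical car weights (pounds) with unequal group sizes and spreads
european = [2400, 2500, 2350, 2475, 2430]
japanese = [2200, 2250, 2180]
```

With these numbers the two rows give noticeably different t values, which is exactly why Levene’s test is used to pick the appropriate row before reading off the p-value.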