The t-distribution (also known as Student's t-distribution) describes the standardized distances of sample means to the population mean when the population standard deviation is not known and the observations come from a normally distributed population. The key difference between the t- and z-distributions is that the standard normal (z-) distribution assumes you know the population standard deviation, while the t-distribution is based on the sample standard deviation.
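As a concrete sketch, the standardized distance behind this distribution, the t-statistic, can be computed directly from a sample; the data values and hypothesized mean below are made up for illustration:

```python
import math

def t_statistic(sample, mu0):
    """t = (sample mean - mu0) / (s / sqrt(n)), using the sample standard deviation s."""
    n = len(sample)
    mean = sum(sample) / n
    # Sample standard deviation (n - 1 in the denominator)
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return (mean - mu0) / (s / math.sqrt(n))

# Illustrative values: five observations, hypothesized population mean 70
print(round(t_statistic([72, 69, 75, 71, 74], 70), 3))
```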
The t-distribution is similar to a normal distribution and has a precise mathematical definition. Instead of diving into the math, let's look at the useful properties of the t-distribution and why it is important in analyses.
Consider the following graph comparing three t-distributions with a standard normal distribution: Figure 1: Three t-distributions and a standard normal (z-) distribution.
All of the distributions have a smooth shape. All are symmetric. All have a mean of zero. The shape of the t-distribution depends on the degrees of freedom. The curves with more degrees of freedom are taller and have thinner tails. All three t-distributions have “heavier tails” than the z-distribution.

You can see how the curves with more degrees of freedom are more like a z-distribution. Compare the pink curve with one degree of freedom to the green curve for the z-distribution. The t-distribution with one degree of freedom is shorter and has thicker tails than the z-distribution. Then compare the blue curve with 10 degrees of freedom to the green curve for the z-distribution. These two distributions are very similar.

A common rule of thumb is that for a sample size of at least 30, one can use the z-distribution in place of a t-distribution. Figure 2 below shows a t-distribution with 30 degrees of freedom and a z-distribution. The figure uses a dotted-line green curve for z, so that you can see both curves. This similarity is one reason why a z-distribution is used in statistical methods in place of a t-distribution when sample sizes are sufficiently large. Figure 2: z-distribution and t-distribution with 30 degrees of freedom
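These shape claims can be checked numerically. The sketch below evaluates the t-density (written out from its standard formula) and the standard normal density at the center and out in a tail; the helper names are our own:

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def z_pdf(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Peak heights: more degrees of freedom -> taller curve, closer to the z-curve
for df in (1, 10, 30):
    print(df, round(t_pdf(0, df), 4))
print("z", round(z_pdf(0), 4))

# Density at x = 3: heavier t-tails for small degrees of freedom
for df in (1, 10, 30):
    print(df, round(t_pdf(3, df), 4))
print("z", round(z_pdf(3), 4))
```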
When you perform a t-test, you check whether your test statistic is a more extreme value than expected from the t-distribution. For a two-tailed test, you look at both tails of the distribution. Figure 3 below shows the decision process for a two-tailed test. The curve is a t-distribution with 21 degrees of freedom. The critical value from the t-distribution with α = 0.05/2 = 0.025 in each tail is 2.080. For a two-tailed test, you reject the null hypothesis if the absolute value of the test statistic is larger than the critical value, that is, if the test statistic falls in either the lower tail or the upper tail. If the test statistic is within the two reference lines, then you fail to reject the null hypothesis. Figure 3: Decision process for a two-tailed test
For a one-tailed test, you look at only one tail of the distribution. For example, Figure 4 below shows the decision process for a one-tailed test. The curve is again a t-distribution with 21 degrees of freedom. For a one-tailed test, the value from the t-distribution with α = 0.05 is 1.721. You reject the null hypothesis if the test statistic is larger than the reference value. If the test statistic is below the reference line, then you fail to reject the null hypothesis. Figure 4: Decision process for a one-tailed test
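Both reference values quoted above (2.080 and 1.721 for 21 degrees of freedom) can be recovered numerically. The sketch below approximates the t-distribution's CDF by trapezoid integration of its density and inverts it by bisection; it is an illustration, not a production routine:

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=4000):
    """P(T <= x) for x >= 0: 0.5 plus the integral of the density from 0 to x."""
    total, h = 0.0, x / steps
    for i in range(steps):
        total += (t_pdf(i * h, df) + t_pdf((i + 1) * h, df)) * h / 2
    return 0.5 + total

def t_critical(upper_tail_prob, df):
    """Value t with P(T > t) = upper_tail_prob, found by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > upper_tail_prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_critical(0.025, 21), 3))  # two-tailed test at alpha = 0.05
print(round(t_critical(0.05, 21), 3))   # one-tailed test at alpha = 0.05
```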
Most people use software to perform the calculations needed for t-tests. But many statistics books still show t-tables, so understanding how to use a table might be helpful. The basic steps are: find the row for your degrees of freedom, find the column for your α level (one- or two-tailed, depending on how the table is laid out), and read the critical value where they intersect.
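As a rough illustration of how a table lookup works, here is a toy table with a few standard rows (the values match widely published t-tables; the function name is our own):

```python
# A few rows from a typical t-table: critical values by degrees of freedom
# and upper-tail probability, as printed in standard tables.
T_TABLE = {
    10: {0.05: 1.812, 0.025: 2.228},
    21: {0.05: 1.721, 0.025: 2.080},
    30: {0.05: 1.697, 0.025: 2.042},
}

def critical_value(df, alpha, two_tailed=False):
    """Look up the critical value; a two-tailed test splits alpha across both tails."""
    tail = alpha / 2 if two_tailed else alpha
    return T_TABLE[df][tail]

print(critical_value(21, 0.05))                   # one-tailed, alpha = 0.05
print(critical_value(21, 0.05, two_tailed=True))  # two-tailed, alpha = 0.05
```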
The central limit theorem (CLT) states that if you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population with replacement, then the distribution of the sample means will be approximately normal. For the random samples we take from the population, the mean of the sample means is μ and the standard deviation of the sample means is σ/√n. Before illustrating the use of the CLT, we will first illustrate the result. In order for the result of the CLT to hold, the sample must be sufficiently large (n > 30). One exception: if the population is normal, then the result holds for samples of any size (i.e., the sampling distribution of the sample means will be approximately normal even for samples of size less than 30).

Central Limit Theorem with a Normal Population

The figure below illustrates a normally distributed characteristic, X, in a population in which the population mean is 75 with a standard deviation of 8. If we take simple random samples (with replacement) of size n=10 and compute the mean of each sample, the distribution of sample means should be approximately normal. The mean of the sample means is 75 and the standard deviation of the sample means is 2.5, computed as σ/√n = 8/√10 ≈ 2.5. If we were to take samples of n=5 instead of n=10, we would get a similar distribution, but the variation among the sample means would be larger. In fact, when we did this we got a sample mean = 75 and a sample standard deviation = 3.6.

Central Limit Theorem with a Dichotomous Outcome

Now suppose we measure a characteristic, X, in a population and that this characteristic is dichotomous (e.g., success of a medical procedure: yes or no) with 30% of the population classified as a success (i.e., p=0.30) as shown below. The Central Limit Theorem applies even to binomial populations like this provided that the minimum of np and n(1-p) is at least 5, where "n" refers to the sample size and "p" is the probability of "success" on any given trial.
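The simulation below is one way to check these numbers: it draws many samples of size n=10 from a normal population with μ=75 and σ=8 and summarizes the sample means. The seed and repetition count are arbitrary choices:

```python
import math
import random

random.seed(1)
mu, sigma, n, reps = 75, 8, 10, 20000

# Draw many samples of size n and record each sample's mean
means = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(sum(sample) / n)

grand_mean = sum(means) / reps
se = math.sqrt(sum((m - grand_mean) ** 2 for m in means) / (reps - 1))
print(round(grand_mean, 1), round(se, 2))  # close to 75 and 8/sqrt(10) ≈ 2.5
```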
In this case, we will take samples of n=20 with replacement, so min(np, n(1-p)) = min(20(0.3), 20(0.7)) = min(6, 14) = 6. Therefore, the criterion is met. We saw previously that the mean and standard deviation for a binomial population (coding a success as 1 and a failure as 0) are: mean μ = p and standard deviation σ = √(p(1-p)). The distribution of sample means based on samples of size n=20 is shown below. The mean of the sample means is p = 0.3, and the standard deviation of the sample means is √(p(1-p))/√n = √(0.3 × 0.7)/√20 ≈ 0.10. Now, instead of taking samples of n=20, suppose we take simple random samples (with replacement) of size n=10. Note that in this scenario we do not meet the sample size requirement for the Central Limit Theorem (i.e., min(np, n(1-p)) = min(10(0.3), 10(0.7)) = min(3, 7) = 3). The distribution of sample means based on samples of size n=10 is shown on the right, and you can see that it is not quite normally distributed. The sample size must be larger in order for the distribution to approach normality.

Central Limit Theorem with a Skewed Distribution

The Poisson distribution is another probability model that is useful for modeling discrete variables such as the number of events occurring during a given time interval. For example, suppose you typically receive about 4 spam emails per day, but the number varies from day to day. Today you happened to receive 5 spam emails. What is the probability of that happening, given that the typical rate is 4 per day? The Poisson probability is P(X = x) = e^(−μ) μ^x / x!, with mean μ and standard deviation σ = √μ. Here μ is the average or typical rate, x is the actual number of events that occur ("successes"), and e is the constant approximately equal to 2.71828. So, in the example above, P(X = 5) = e^(−4) × 4^5 / 5! ≈ 0.156. Now let's consider another Poisson distribution, with μ=3 and σ = √3 ≈ 1.73. The distribution is shown in the figure below.
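A quick simulation (our own sketch, with arbitrary seed and repetition count) illustrates the n=20 case: the mean of the sample means lands near p = 0.30 and their standard deviation near √(p(1-p))/√n ≈ 0.10:

```python
import math
import random

random.seed(2)
p, n, reps = 0.30, 20, 20000

# Each sample mean is the proportion of successes in n Bernoulli(p) trials
means = []
for _ in range(reps):
    successes = sum(1 for _ in range(n) if random.random() < p)
    means.append(successes / n)

grand_mean = sum(means) / reps
se = math.sqrt(sum((m - grand_mean) ** 2 for m in means) / (reps - 1))
print(round(grand_mean, 3), round(se, 3))  # near 0.30 and sqrt(0.3 * 0.7 / 20) ≈ 0.10
```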
This population is not normally distributed, but the Central Limit Theorem will apply if n > 30. In fact, if we take samples of size n=30, we obtain sample means distributed as shown in the first graph below, with a mean of 3 and standard deviation = 0.32. In contrast, with small samples of n=10, we obtain sample means distributed as shown in the lower graph. Note that n=10 does not meet the criterion for the Central Limit Theorem, and the small samples give a distribution of sample means that is not quite normal. Also note that the standard deviation of the sample means (also called the "standard error") is larger for n=10 than for n=30: √3/√10 ≈ 0.55 versus √3/√30 ≈ 0.32.
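The two Poisson cases can be simulated as well. Since Python's standard library has no Poisson sampler, the sketch below uses Knuth's classic method; seeds and repetition counts are arbitrary:

```python
import math
import random

random.seed(3)
mu = 3

def poisson(mu):
    """Knuth's method: count uniform draws whose running product stays above e^-mu."""
    limit, k, prod = math.exp(-mu), 0, random.random()
    while prod > limit:
        k += 1
        prod *= random.random()
    return k

def simulate(n, reps=20000):
    """Mean and standard deviation of sample means from samples of size n."""
    means = [sum(poisson(mu) for _ in range(n)) / n for _ in range(reps)]
    grand = sum(means) / reps
    se = math.sqrt(sum((m - grand) ** 2 for m in means) / (reps - 1))
    return grand, se

results = {}
for n in (30, 10):
    grand, se = simulate(n)
    results[n] = (grand, se)
    print(n, round(grand, 2), round(se, 2))  # SE near sqrt(3/n): about 0.32 and 0.55
```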