What happens to a t-distribution as the degrees of freedom increase? Do the center and the area in the tails increase?

I'll try to give an intuitive explanation.

The t-statistic* has a numerator and a denominator. For example, the statistic in the one sample t-test is

$$\frac{\bar{x}-\mu_0}{s/\sqrt{n}}$$

*(there are several, but this discussion should hopefully be general enough to cover the ones you are asking about)
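
Before taking the statistic apart, a quick simulation sketch can confirm that it really follows a t-distribution with $n-1$ d.f. (the setup $\mu_0=0$, $\sigma=1$, $n=5$ is assumed purely for illustration):

```python
# Simulate the one-sample t-statistic and compare with scipy's t distribution.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps, mu0 = 5, 100_000, 0.0

samples = rng.normal(loc=mu0, scale=1.0, size=(reps, n))
xbar = samples.mean(axis=1)
s = samples.std(axis=1, ddof=1)              # sample standard deviation
t_stats = (xbar - mu0) / (s / np.sqrt(n))

# Empirical vs theoretical 97.5% quantile (about 2.776 for 4 d.f.)
print(np.quantile(t_stats, 0.975), stats.t.ppf(0.975, df=n - 1))
```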

Under the assumptions, the numerator has a normal distribution with mean 0 and some unknown standard deviation.

Under the same set of assumptions, the denominator is an estimate of the standard deviation of the distribution of the numerator (the standard error of the statistic in the numerator). It is independent of the numerator. Its square is $\sigma^2_\text{numerator}$ times a chi-square random variable divided by its degrees of freedom (which is also the d.f. of the t-distribution).

When the degrees of freedom are small, the denominator tends to be fairly right-skew. It has a high chance of being less than its mean, and a relatively good chance of being quite small. At the same time, it also has some chance of being much, much larger than its mean.
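
A quick sketch of that right-skewness, again assuming $\sigma = 1$ and $n = 5$ (so 4 d.f.):

```python
# The denominator s/sqrt(n) is right-skewed for small degrees of freedom:
# its median sits below its mean and it is usually below its mean.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, reps = 5, 100_000
se = rng.normal(size=(reps, n)).std(axis=1, ddof=1) / np.sqrt(n)

print("mean:", se.mean(), "median:", np.median(se))  # median < mean
print("skewness:", stats.skew(se))                   # clearly positive
print("P(below mean):", (se < se.mean()).mean())     # above 0.5
```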

Under the assumption of normality, the numerator and denominator are independent. So if we draw randomly from the distribution of this t-statistic we have a normal random number divided by a second randomly* chosen value from a right-skew distribution that's on average around 1.

* without regard to the normal term

Because it's on the denominator, the small values in the distribution of the denominator produce very large t-values. The right-skew in the denominator makes the t-statistic heavy-tailed. Meanwhile the denominator's right tail, since we divide by it, produces t-values near zero, making the t-distribution more sharply peaked than a normal with the same standard deviation as the t.

However, as the degrees of freedom become large, the distribution becomes much more normal-looking and much more "tight" around its mean.
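
For instance, the tail mass beyond $\pm 3$ shrinks toward the normal value as the d.f. grow; a quick check with scipy:

```python
# Two-sided tail probability P(|T| > 3) for increasing degrees of freedom.
from scipy import stats

for df in (2, 5, 30, 1000):
    print(df, 2 * stats.t.sf(3, df=df))
print("normal", 2 * stats.norm.sf(3))  # about 0.0027
```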


As such, the effect of dividing by the denominator on the shape of the distribution of the numerator reduces as the degrees of freedom increase.

Eventually, as Slutsky's theorem might suggest to us could happen, the effect of the denominator becomes more like dividing by a constant, and the distribution of the t-statistic is very close to normal.
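A numerical sketch of that Slutsky effect (assuming $\sigma = 1$): the spread of $s$ around $\sigma$ shrinks as $n$ grows, so dividing by $s/\sqrt{n}$ acts more and more like dividing by a constant.

```python
# The sample standard deviation s concentrates around sigma = 1 as n grows.
import numpy as np

rng = np.random.default_rng(2)
for n in (5, 30, 200, 2000):
    s = rng.normal(size=(50_000, n)).std(axis=1, ddof=1)
    print(n, s.std())  # spread of s around 1 shrinks toward 0
```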

Considered in terms of the reciprocal of the denominator

whuber suggested in comments that it might be more illuminating to look at the reciprocal of the denominator. That is, we could write our t-statistics as numerator (normal) times reciprocal-of-denominator (right-skew).

For example, our one-sample-t statistic above would become:

$${\sqrt{n}(\bar{x}-\mu_0)}\cdot{1/s}$$

Now consider the population standard deviation of the original $X_i$, $\sigma_x$. We can multiply and divide by it, like so:

$${\sqrt{n}(\bar{x}-\mu_0)/\sigma_x}\cdot{\sigma_x/s}$$

The first term is standard normal. The second term (the square root of a scaled inverse-chi-squared random variable) then scales that standard normal by values that are either larger or smaller than 1, "spreading it out".

Under the assumption of normality, the two terms in the product are independent. So if we draw randomly from the distribution of this t-statistic we have a normal random number (the first term in the product) times a second randomly-chosen value (without regard to the normal term) from a right-skew distribution that's 'typically' around 1.

When the d.f. are large, the value tends to be very close to 1, but when the d.f. are small, it's quite skewed and the spread is large, with the big right tail of this scaling factor making the tail of the t quite fat:

[Figure: simulated distribution of the scaling factor for small and large degrees of freedom]
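
Under normality this scaling factor is $\sigma_x/s = \sqrt{(n-1)/\chi^2_{n-1}}$, so it is easy to simulate directly; the d.f. values below are illustrative:

```python
# Simulate the scaling factor sigma_x / s = sqrt(df / chi2_df) for several df.
import numpy as np

rng = np.random.default_rng(3)
for df in (2, 10, 100):
    factor = np.sqrt(df / rng.chisquare(df, size=100_000))
    # At low df the 99th percentile is far above 1; at high df it hugs 1.
    print(df, np.median(factor), np.quantile(factor, 0.99))
```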

The T distribution, also known as the Student’s t-distribution, is a type of probability distribution that is similar to the normal distribution with its bell shape but has heavier tails. T distributions have a greater chance for extreme values than normal distributions, hence the fatter tails.

  • The T distribution is a continuous probability distribution of the z-score when the estimated standard deviation is used in the denominator rather than the true standard deviation.
  • The T distribution, like the normal distribution, is bell-shaped and symmetric, but it has heavier tails, which means it tends to produce values that fall far from its mean.
  • T-tests are used in statistics to assess statistical significance.

Tail heaviness is determined by a parameter of the T distribution called degrees of freedom, with smaller values giving heavier tails, and with higher values making the T distribution resemble a standard normal distribution with a mean of 0 and a standard deviation of 1.
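
One way to quantify the tail heaviness: the excess kurtosis of a T distribution is $6/(\nu-4)$ for $\nu > 4$ degrees of freedom, falling toward the normal's 0 as $\nu$ grows. A quick check with scipy:

```python
# Excess kurtosis of the t distribution for several degrees of freedom.
from scipy import stats

for df in (5, 10, 30, 100):
    print(df, stats.t.stats(df, moments='k'), 6 / (df - 4))  # the two agree
```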


When a sample of n observations is taken from a normally distributed population having mean M and standard deviation D, the sample mean, m, and the sample standard deviation, d, will differ from M and D because of the randomness of the sample.

A z-score for the sample mean can be calculated with the population standard deviation as $Z = (m - M)/(D/\sqrt{n})$, and this value has the normal distribution with mean 0 and standard deviation 1. But when the estimated standard deviation must be used, the score is calculated as $T = (m - M)/(d/\sqrt{n})$; substituting $d$ for $D$ makes the distribution a T distribution with $(n - 1)$ degrees of freedom rather than the normal distribution with mean 0 and standard deviation 1.
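A minimal sketch of the two standardizations side by side (the values M = 0, D = 1, n = 10 are hypothetical):

```python
# Z uses the known population standard deviation; T substitutes the estimate.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
M, D, n = 0.0, 1.0, 10
x = rng.normal(M, D, size=n)
m, d = x.mean(), x.std(ddof=1)

Z = (m - M) / (D / np.sqrt(n))  # standard normal
T = (m - M) / (d / np.sqrt(n))  # t with n - 1 degrees of freedom
print(Z, T, 2 * stats.t.sf(abs(T), df=n - 1))  # two-sided t p-value
```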

Take the following example for how t-distributions are put to use in statistical analysis. First, remember that a confidence interval for the mean is a range of values, calculated from the data, meant to capture a “population” mean. This interval is $m \pm t \cdot d/\sqrt{n}$, where $t$ is a critical value from the T distribution.

For instance, a 95% confidence interval for the mean return of the Dow Jones Industrial Average in the 27 trading days prior to 9/11/2001 is $-0.33\% \pm 2.055 \times 1.07\%/\sqrt{27}$, giving the mean return as some number between $-0.75\%$ and $+0.09\%$. The number 2.055, the number of standard errors to adjust by, is found from the T distribution.
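
The interval can be reproduced in a few lines from the figures quoted above:

```python
# Rebuild the 95% confidence interval: m = -0.33%, d = 1.07%, n = 27.
import numpy as np
from scipy import stats

m, d, n = -0.33, 1.07, 27
t_crit = stats.t.ppf(0.975, df=n - 1)   # about 2.056, the "2.055" above
half = t_crit * d / np.sqrt(n)
print(t_crit, (m - half, m + half))     # roughly (-0.75, 0.09)
```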

Because the T distribution has fatter tails than a normal distribution, it can be used as a model for financial returns that exhibit excess kurtosis, which will allow for a more realistic calculation of Value at Risk (VaR) in such cases.
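
As a rough sketch of that idea (not a production VaR model), one could fit a t distribution to a return series and read the 1% quantile off the fit; the returns below are simulated stand-ins, not real market data:

```python
# Fit a t distribution to (fake) returns and compare the 1% quantile (99% VaR)
# with the thinner-tailed normal fit.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
returns = stats.t.rvs(df=4, scale=0.01, size=2_000, random_state=rng)

df, loc, scale = stats.t.fit(returns)
print("t VaR:     ", stats.t.ppf(0.01, df, loc=loc, scale=scale))
print("normal VaR:", stats.norm.ppf(0.01, returns.mean(), returns.std()))
```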

Normal distributions are used when the population distribution is assumed to be normal. The T distribution is similar to the normal distribution, just with fatter tails. Both assume a normally distributed population. T distributions have higher kurtosis than normal distributions. The probability of getting values very far from the mean is larger with a T distribution than a normal distribution.

The T distribution can sacrifice some exactness relative to the normal distribution; this shortcoming arises only when there is a need for perfect normality. The T-distribution should only be used when the population standard deviation is not known. If the population standard deviation is known and the sample size is large enough, the normal distribution should be used for better results.


The t-distribution describes the standardized distances of sample means to the population mean when the population standard deviation is not known, and the observations come from a normally distributed population.

Is the t-distribution the same as the Student’s t-distribution?

Yes.

What’s the key difference between the t- and z-distributions?

The standard normal or z-distribution assumes that you know the population standard deviation. The t-distribution is based on the sample standard deviation.

The t-distribution is similar to a normal distribution. It has a precise mathematical definition. Instead of diving into complex math, let’s look at the useful properties of the t-distribution and why it is important in analyses.

  • Like the normal distribution, the t-distribution has a smooth shape.
  • Like the normal distribution, the t-distribution is symmetric. If you think about folding it in half at the mean, each side will be the same.
  • Like a standard normal distribution (or z-distribution), the t-distribution has a mean of zero.
  • The normal distribution assumes that the population standard deviation is known. The t-distribution does not make this assumption.
  • The t-distribution is defined by the degrees of freedom. These are related to the sample size.
  • The t-distribution is most useful for small sample sizes, when the population standard deviation is not known, or both.
  • As the sample size increases, the t-distribution becomes more similar to a normal distribution (the sketch after this list makes the convergence concrete).
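
To make the last point concrete: the 97.5% critical value of the t-distribution falls toward the normal's 1.96 as the degrees of freedom grow. A quick check with scipy:

```python
# Critical values t(0.975, df) approach the normal's 1.960.
from scipy import stats

for df in (1, 5, 10, 30, 100, 1000):
    print(df, round(stats.t.ppf(0.975, df), 3))
print("normal", round(stats.norm.ppf(0.975), 3))  # 1.96
```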

Consider the following graph comparing three t-distributions with a standard normal distribution:

Figure 1: Three t-distributions and a standard normal (z-) distribution.

All of the distributions have a smooth shape. All are symmetric. All have a mean of zero.

The shape of the t-distribution depends on the degrees of freedom. The curves with more degrees of freedom are taller and have thinner tails. All three t-distributions have “heavier tails” than the z-distribution.

You can see how the curves with more degrees of freedom are more like a z-distribution. Compare the pink curve with one degree of freedom to the green curve for the z-distribution. The t-distribution with one degree of freedom is shorter and has thicker tails than the z-distribution. Then compare the blue curve with 10 degrees of freedom to the green curve for the z-distribution. These two distributions are very similar.

A common rule of thumb is that for a sample size of at least 30, one can use the z-distribution in place of a t-distribution. Figure 2 below shows a t-distribution with 30 degrees of freedom and a z-distribution. The figure uses a dotted-line green curve for z, so that you can see both curves. This similarity is one reason why a z-distribution is used in statistical methods in place of a t-distribution when sample sizes are sufficiently large.
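
One way to put a number on "sufficiently similar" (a sketch; the grid of cutoffs is arbitrary): compare cumulative probabilities of the two curves.

```python
# Largest gap in cumulative probability between t(30 df) and z.
import numpy as np
from scipy import stats

x = np.linspace(-4, 4, 801)
gap = np.max(np.abs(stats.t.cdf(x, df=30) - stats.norm.cdf(x)))
print(gap)  # roughly 0.005: the two curves nearly coincide
```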

Figure 2: z-distribution and t-distribution with 30 degrees of freedom

When you perform a t-test, you check if your test statistic is a more extreme value than expected from the t-distribution.

For a two-tailed test, you look at both tails of the distribution. Figure 3 below shows the decision process for a two-tailed test. The curve is a t-distribution with 21 degrees of freedom. The value from the t-distribution with α = 0.05/2 = 0.025 is 2.080. For a two-tailed test, you reject the null hypothesis if the absolute value of the test statistic is larger than the reference value, that is, if it falls in either the lower tail or the upper tail. If the test statistic is within the two reference lines, then you fail to reject the null hypothesis.
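
Both the cutoff and the decision rule are easy to check with scipy (the test statistic below is hypothetical):

```python
# Two-tailed cutoff for alpha = 0.05 and 21 degrees of freedom.
from scipy import stats

t_crit = stats.t.ppf(1 - 0.05 / 2, df=21)
print(round(t_crit, 3))  # 2.080

t_obs = 2.5  # hypothetical test statistic
print("reject" if abs(t_obs) > t_crit else "fail to reject")
```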

Figure 3: Decision process for a two-tailed test

For a one-tailed test, you look at only one tail of the distribution. For example, Figure 4 below shows the decision process for a one-tailed test. The curve is again a t-distribution with 21 degrees of freedom. For a one-tailed test, the value from the t-distribution with α = 0.05 is 1.721. You reject the null hypothesis if the test statistic is larger than the reference value. If the test statistic is below the reference line, then you fail to reject the null hypothesis.
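
The one-tailed cutoff checks out the same way:

```python
# One-tailed cutoff for alpha = 0.05 and 21 degrees of freedom.
from scipy import stats

print(round(stats.t.ppf(1 - 0.05, df=21), 3))  # 1.721
```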

Figure 4: Decision process for a one-tailed test

Most people use software to perform the calculations needed for t-tests. But many statistics books still show t-tables, so understanding how to use a table might be helpful. The steps below describe how to use a typical t-table.

  1. Identify if the table is for two-tailed or one-tailed tests. Then, decide if you have a one-tailed or a two-tailed test. The columns for a t-table identify different alpha levels.
    If you have a table for a one-tailed test, you can still use it for a two-tailed test. If you set α = 0.05 for your two-tailed test and have only a one-tailed table, then use the column for α = 0.025.
  2. Identify the degrees of freedom for your data. The rows of a t-table correspond to different degrees of freedom. Most tables go up to 30 degrees of freedom and then stop. The tables assume people will use a z-distribution for larger sample sizes.
  3. Find the cell in the table at the intersection of your α level and degrees of freedom. This is the t-distribution value. Compare your statistic to the t-distribution value and make the appropriate conclusion. (The sketch below shows how to generate such a table with software.)
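
With software available, you can also generate such a table instead of looking one up. A sketch with scipy (the α levels and degrees of freedom shown are illustrative):

```python
# Build a small one-tailed t-table: rows are degrees of freedom, columns are
# alpha levels. A two-tailed test at 0.05 uses the 0.025 column (see step 1).
from scipy import stats

alphas = (0.10, 0.05, 0.025, 0.01)
print("df   " + "  ".join(f"{a:>6}" for a in alphas))
for df in (1, 5, 10, 21, 30):
    row = "  ".join(f"{stats.t.ppf(1 - a, df):6.3f}" for a in alphas)
    print(f"{df:<4} {row}")
```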