Measures of Center
CO-4: Distinguish among different measurement scales, choose the appropriate descriptive and inferential statistical methods based on these distinctions, and interpret the results.

LO 4.4: Using appropriate graphical displays and/or numerical measures, describe the distribution of a quantitative variable in context: a) describe the overall pattern, b) describe striking deviations from the pattern.

LO 4.7: Define and describe the features of the distribution of one quantitative variable (shape, center, spread, outliers).
Introduction

Intuitively speaking, a numerical measure of center describes a “typical value” of the distribution. The two main numerical measures for the center of a distribution are the mean and the median. In this unit on Exploratory Data Analysis, we calculate these measures from a sample, so we will often emphasize that the values calculated are the sample mean and the sample median. Each of these measures is based on a completely different idea of describing the center of a distribution. We will first present each measure, and then compare their properties.

Mean

LO 4.8: Define and calculate the sample mean of a quantitative variable.

The mean is the average of a set of observations (i.e., the sum of the observations divided by the number of observations). For n observations x_1, x_2, ..., x_n:

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
We read the symbol $\bar{x}$ as “x-bar.” The bar notation is commonly used to represent the sample mean, i.e. the mean of the sample. Using any appropriate letter to represent the variable (x, y, etc.), we can indicate the sample mean of this variable by adding a bar over the variable notation.

EXAMPLE: Best Actress Oscar Winners

We will continue with the Best Actress Oscar winners example. The ages of the 32 winners are:

34 34 26 37 42 41 35 31 41 33 30 74 33 49 38 61 21 41 26 80 43 29 33 35 45 49 39 34 26 25 35 33

The mean age of the 32 actresses is:

$$\bar{x} = \frac{34 + 34 + 26 + \cdots + 35 + 33}{32} = \frac{1233}{32} \approx 38.5$$

We add all of the ages to get 1233 and divide by the number of ages, 32, to get 38.5. We denote this result as x-bar and call it the sample mean. Note that the sample mean gives a measure of center that is higher than our approximation of the center from looking at the histogram (which was 35). The reason for this will be clear soon.
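The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch rather than part of the course materials; the helper name `sample_mean` is hypothetical, and the ages are copied from the list above.

```python
# Ages of the 32 Best Actress Oscar winners, copied from the example.
ages = [34, 34, 26, 37, 42, 41, 35, 31, 41, 33, 30, 74, 33, 49, 38, 61,
        21, 41, 26, 80, 43, 29, 33, 35, 45, 49, 39, 34, 26, 25, 35, 33]


def sample_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)


n = len(ages)              # number of observations
total = sum(ages)          # sum of the observations
x_bar = sample_mean(ages)  # the sample mean, "x-bar"

print(n, total, round(x_bar, 1))  # 32 1233 38.5
```

The unrounded value is 38.53125; the text reports it to one decimal place.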
EXAMPLE: World Cup Soccer

Often we have large sets of data and use a frequency table to display the data more efficiently. Data were collected from the last three World Cup soccer tournaments. A total of 192 games were played. The table below lists the number of goals scored per game (not including any goals scored in shootouts).

Total # Goals/Game    Frequency
0                     17
1                     45
2                     51
3                     37
4                     25
5                     11
6                     3
7                     2
8                     1

To find the mean number of goals scored per game, we would need to find the sum of all 192 numbers, and then divide that sum by 192. Rather than add 192 numbers, we use the fact that the same numbers appear many times. For example, the number 0 appears 17 times, the number 1 appears 45 times, the number 2 appears 51 times, etc. If we add up 17 zeros, we get 0. If we add up 45 ones, we get 45. If we add up 51 twos, we get 102. Repeated addition is multiplication.

Thus, the sum of the 192 numbers = 0(17) + 1(45) + 2(51) + 3(37) + 4(25) + 5(11) + 6(3) + 7(2) + 8(1) = 453. The sample mean is then 453 / 192 = 2.359.

Note that, in this example, the values 1, 2, and 3 are the most common, and our average falls in this range, which represents the bulk of the data.
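The frequency-table shortcut can be sketched in Python as follows. This is an illustrative sketch only; the names `goal_counts` and `mean_from_frequencies` are hypothetical, and the frequencies are those given in the table.

```python
# Frequency table from the example: goals per game -> number of games.
goal_counts = {0: 17, 1: 45, 2: 51, 3: 37, 4: 25, 5: 11, 6: 3, 7: 2, 8: 1}


def mean_from_frequencies(freq):
    """Mean of data summarized as {value: frequency}.

    Each value is multiplied by its frequency (repeated addition),
    and the grand total is divided by the total number of observations.
    """
    n = sum(freq.values())
    total = sum(value * count for value, count in freq.items())
    return total / n


games = sum(goal_counts.values())
total_goals = sum(value * count for value, count in goal_counts.items())
mean_goals = mean_from_frequencies(goal_counts)

print(games, total_goals, round(mean_goals, 3))  # 192 453 2.359
```

Working from the table gives the same result as averaging all 192 individual values, because each product value × frequency stands in for that many repeated additions.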
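Median

LO 4.9: Define and calculate the sample median of a quantitative variable.

The median M is the midpoint of the distribution. It is the number such that half of the observations fall above, and half fall below. To find the median:

1. Order the data from smallest to largest.
2. Consider whether n, the number of observations, is even or odd.
3. If n is odd, the median M is the center observation in the ordered list, the one in position (n + 1) / 2.
4. If n is even, the median M is the mean of the two center observations in the ordered list, the ones in positions n / 2 and (n / 2) + 1.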
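EXAMPLE: Median (1)

For a simple illustration of the location of the median, consider two simple cases of n = 7 and n = 8 ordered observations. For n = 7, the median is the 4th ordered observation, since (7 + 1) / 2 = 4, with three observations on each side of it. For n = 8, the median is the mean of the 4th and 5th ordered observations, with four observations on each side.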
EXAMPLE: Median (2)

To find the median age of the Best Actress Oscar winners, we first need to order the data. It is useful, then, to use the stemplot, a diagram in which the data are already ordered.

2 | 1 5 6 6 6 9
3 | 0 1 3 3 3 3 4 4 4 5 5 5 7 8 9
4 | 1 1 1 2 3 5 9 9
5 |
6 | 1
7 | 4
8 | 0
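Counting from the top, we find that the two center observations (since n = 32 is even, the 16th and 17th ranked observations) are both 35.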
Therefore, the median M = (35 + 35) / 2 = 35.

Comparing the Mean and the Median

LO 4.10: Choose the appropriate measures for a quantitative variable based upon the shape of the distribution.

As we have seen, the mean and the median, the most common measures of center, each describe the center of a distribution of values in a different way.
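Both measures can be computed for the Best Actress ages. The following is a minimal Python sketch, not part of the original course materials. The function name `sample_median` is hypothetical; it follows the usual rule of sorting the data and taking the center observation when n is odd, or the mean of the two center observations when n is even. The standard-library `statistics.median` is used only as a cross-check.

```python
from statistics import median  # standard library, used as a cross-check


def sample_median(values):
    """Median by position: sort, then take the center observation(s)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: 1-based position (n + 1) / 2 is 0-based index n // 2.
        return ordered[mid]
    # Even n: mean of 1-based positions n / 2 and n / 2 + 1.
    return (ordered[mid - 1] + ordered[mid]) / 2


ages = [34, 34, 26, 37, 42, 41, 35, 31, 41, 33, 30, 74, 33, 49, 38, 61,
        21, 41, 26, 80, 43, 29, 33, 35, 45, 49, 39, 34, 26, 25, 35, 33]

m = sample_median(ages)
x_bar = sum(ages) / len(ages)

print(m)                # 35.0
print(round(x_bar, 1))  # 38.5
print(m == median(ages))  # True
```

For these data the mean (38.5) is noticeably larger than the median (35), even though both are computed from the same 32 ages.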
To get a deeper understanding of the differences between these two measures of center, consider the following example. Here are two data sets:

Data set A → 64 65 66 68 70 71 73
Data set B → 64 65 66 68 70 71 730

For data set A, the mean is 68.1, and the median is 68. Looking at data set B, notice that all of the observations except the last one are close together. The observation 730 is very large, and is certainly an outlier. In this case, the median is still 68, but the mean is influenced by the high outlier and shifts up to 162. The message that we should take from this example is:

The mean is very sensitive to outliers (because it factors in their magnitude), while the median is resistant (or robust) to outliers.

Therefore:

- For symmetric distributions with no outliers, the mean is approximately equal to the median.
- For distributions that are skewed right and/or have high outliers, the mean is typically greater than the median.
- For distributions that are skewed left and/or have low outliers, the mean is typically less than the median.
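The comparison of data sets A and B can be reproduced numerically. This is a small illustrative sketch using Python's standard `statistics.median`; the variable names are hypothetical, and the values are the two data sets from the example.

```python
from statistics import median

# The two data sets from the example; B differs from A only in its last value.
data_a = [64, 65, 66, 68, 70, 71, 73]
data_b = [64, 65, 66, 68, 70, 71, 730]

mean_a = sum(data_a) / len(data_a)
mean_b = sum(data_b) / len(data_b)

for name, data, m in (("A", data_a, mean_a), ("B", data_b, mean_b)):
    print(f"{name}: mean = {m:.1f}, median = {median(data)}")
# A: mean = 68.1, median = 68
# B: mean = 162.0, median = 68
```

Changing a single observation from 73 to 730 moves the mean by almost 94 units, while the median does not move at all, because the median depends only on the order of the observations and not on how extreme the largest one is.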
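Conclusions… When to use which measures?

Use the sample mean as a measure of center for symmetric distributions with no outliers. Otherwise, the median is usually the more appropriate measure of center, because it is resistant to outliers.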
What is a measure of center?

A measure of central tendency (also referred to as a measure of centre or central location) is a summary measure that attempts to describe a whole set of data with a single value that represents the middle or centre of its distribution.
What are the properties of a good measure of central tendency?

A good measure of central tendency should possess the following properties:

- It should be rigidly defined. ...
- It should be easy to understand and simple to calculate. ...
- It should be based on all the observations. ...
- It should be capable of further algebraic treatment. ...
- It should not be unduly affected by extreme observations.

What are the four measures of center?

The four measures of center are mean, median, mode, and midrange.

- Mean: The mean is what you know as the average. ...
- Median: The median is not the same thing as the mean, even though in popular parlance, the two terms are often used interchangeably. ...
- Mode: The mode is the value that occurs most often in a data set.

What are the properties of the mean?

- The mean always exists.
- The mean does not have to be one of the data values.
- The mean uses all the data values.
- The mean is affected by extreme values.