What is the relationship between the two variables represented in the table?

Is there a relationship between plant height and the amount of fertilizer used? Can you explain this?

These examples of seeking a relationship between variables can be quantified by using methods of statistical analysis called Correlation and Regression.

Correlation analysis seeks to identify (by a single number) the degree to which there is a (linear) relation between the numbers in sets of data pairs. The correlation coefficient of a set of data pairs $(x_i, y_i)$, $i = 1, \dots, n$, with x- and y-means $\bar{x}$ and $\bar{y}$ respectively is

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

You don't need to worry about computing this number; it's easy to use a computer to calculate it. The interpretation of this number is more important: it is somewhere between -1 and 1. The closer r is to 1, the more positively correlated are the sets of numbers, in the sense that an increase in x corresponds to a proportional increase in y; similarly, decreases in x correspond to proportional decreases in y. On the other hand, if r is close to -1, then increases in x correspond to decreases in y and decreases in x correspond to increases in y, so we say that x and y are negatively correlated. Finally, if r is close to zero, there is little if any linear relationship between the variables, and we say they are uncorrelated.
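As a rough illustration of what a computer does when it calculates r, here is a minimal Python sketch of the standard Pearson correlation coefficient. The function name `pearson_r` and the data values are assumptions made up for this example; they are not the measurements from the experiments discussed in this handout.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient r for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    # Numerator: sum of the products of the deviations from each mean.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the summed squared deviations.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Made-up illustration data: ramp angle (degrees) and rolling time (seconds).
angles = [10, 20, 30, 40, 50, 60]
times = [5.4, 4.3, 3.9, 2.8, 2.1, 1.2]

r = pearson_r(angles, times)
print(f"r = {r:.2f}")  # negative: time tends to fall as the angle rises
```

A value near -1 here would be read exactly as described above: increases in x go with decreases in y.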

Consider the earlier graphs from the "ball-rolling" and "fertilizer" experiments:

  1. In the graph of time of rolling vs. angle of ramp, as the angle increases, does the rolling time generally
    1. increase, decrease, or change in an unrelated fashion?
    2. Explain your answer from the graph.
    3. The correlation coefficient for this data turns out to be -0.84. Does this agree with your answers above? Explain.
  2. In the graph of plant height vs. fertilizer concentration, as the amount of fertilizer per square meter increases, does plant height generally
    1. increase, decrease, or change in an unrelated fashion?
    2. Explain your answer from the graph.
    3. The correlation coefficient for this data turns out to be 0.37. Does this agree with your answers above? Explain.
  3. Consider the graph that represents the weight (lb.) vs. height (in.) for players on last year's Cincinnati Bengals football team:

[Graph: weight (lb.) vs. height (in.) of Cincinnati Bengals players]

As the height of players increases, does the weight generally

    1. increase, decrease, or change in an unrelated fashion?
    2. Explain your answer from the graph.
    3. Guess which number below best represents the correlation coefficient for this data. Explain your guess.

        i. -0.75    ii. 0.03    iii. 0.73    iv. 0.99

  4. Consider the graph generated from Dr. Denice Robertson's research on lobsters and their production of eggs. She has measured the number of eggs produced by a lobster and the lobster's length (mm). Her data is graphed below:

[Graph: number of eggs produced vs. lobster length (mm)]

As the length of the lobster increases, does the number of eggs produced generally

    1. increase, decrease, or change in an unrelated fashion?
    2. Explain your answer from the graph.
    3. Guess which number below best represents the correlation coefficient for this data. Explain your guess.

        i. -0.89    ii. -0.13    iii. 0.25    iv. 0.91

Regression

Regression analysis is used to determine if a relationship exists between two variables. To do this, a line is created that best fits a set of data pairs. We will use linear regression, which seeks a line with equation y = mx + b that "best fits" the data. The term "best fits" has a precise mathematical meaning that we can think of as "minimizing the distances to the line for each data point" (more precisely, the usual least-squares method minimizes the sum of the squared vertical distances from the points to the line). In addition to an equation for the line, the regression analysis calculates a p-value and an R² value (see below for an explanation of each).

1) Generation of the regression line and equation for the line:

For example, if a computer program for doing regression is applied to the data from the "ball-rolling" experiment, the best-fitting line is shown on the graph below:

[Graph: rolling time vs. ramp angle, with the best-fitting regression line]

It will turn out that any other line will give a larger overall distance to the points than this line does.
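To make the "any other line is worse" claim concrete, here is a minimal Python sketch of an ordinary least-squares fit. It assumes that "overall distance" means the sum of squared vertical distances, which is the usual convention but is not spelled out in this handout. The names `fit_line` and `total_squared_distance` and the data values are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    m = sxy / sxx            # slope
    b = y_mean - m * x_mean  # intercept: the line passes through the means
    return m, b


def total_squared_distance(xs, ys, m, b):
    """Sum of squared vertical distances from the points to y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))


# Made-up illustration data: ramp angle (degrees) and rolling time (seconds).
angles = [10, 20, 30, 40, 50, 60]
times = [5.4, 4.3, 3.9, 2.8, 2.1, 1.2]

m, b = fit_line(angles, times)
best = total_squared_distance(angles, times, m, b)
# Nudging the slope slightly should always give a larger total distance.
nudged = total_squared_distance(angles, times, m + 0.01, b)
print(f"fitted line: y = {m:.3f}x + {b:.2f}")
print(f"fitted total: {best:.3f}, nudged total: {nudged:.3f}")
```

Changing either the slope or the intercept away from the fitted values increases the total, which is what "best fits" means here.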

You can frequently estimate the equation of the regression line (y = mx + b) by estimating its slope (m), the change in y divided by the change in x (i.e. $m = \frac{y_2 - y_1}{x_2 - x_1}$ for two points $(x_1, y_1)$ and $(x_2, y_2)$ on the line), and its y-intercept (b) (i.e. the value of y where the line crosses the y-axis, when x = 0). In a regression graph the x-axis is the independent variable and the y-axis is the dependent variable.

From the graph above, we could estimate that the line has y-intercept close to 6 because if you continue to draw the line out, it crosses the y-axis near 6.

To determine the slope you must first choose two points on the line. These are not existing data points, but points of your choice. The easiest and usually most accurate method is to use the grid lines as your guide in choosing your values on the x-axis and then estimate your y-values. So, for the above graph, choose these two points: (20, 4.5) and (60, 1.25). It is best to spread out the points you choose, one from either end of your line. It is especially important to remember to choose two points ON THE LINE because you are trying to estimate an equation for the line itself, NOT your data points. Using the points we chose, plug the numbers into the equation for the slope.

$$m = \frac{1.25 - 4.5}{60 - 20} = \frac{-3.25}{40} \approx -0.081$$

Recognize that this is just a guess based on "eyeballing the graph". Now plug your values for the slope and the y-intercept into the equation y = mx + b and you will get:

y = -0.081x + 6.
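The same arithmetic can be checked with a few lines of Python. This is only a sketch of the hand estimate; the helper names `slope_between` and `predict_time` are made up for this example, while the two points and the intercept of 6 come from the text.

```python
def slope_between(p1, p2):
    """Slope of the line through two points: (y2 - y1) / (x2 - x1)."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("the two points must have different x-values")
    return (y2 - y1) / (x2 - x1)


# The two points read off the regression line in the text.
m = slope_between((20, 4.5), (60, 1.25))
# The y-intercept eyeballed from the graph in the text.
b = 6
# For comparison, the intercept implied by the two chosen points themselves.
implied_b = 4.5 - m * 20


def predict_time(angle):
    """Estimated rolling time from the eyeballed line y = m*x + b."""
    return m * angle + b


print(f"slope m = {m:.3f}")
print(f"intercept from the two points = {implied_b:.3f} (eyeballed: {b})")
print(f"estimated time at 40 degrees = {predict_time(40):.2f}")
```

The intercept implied by the two points differs slightly from the eyeballed value of 6, which is a reminder that both numbers are estimates read from a graph.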

2) Generation of the R² value

When you do regression analysis using a computer program, you'll sometimes see some indication of the coefficient of determination or "goodness of fit",

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},$$

where $y_i$ is the measured value, $\hat{y}_i$ is the value of the regression line evaluated at $x_i$, $\bar{y}$ is the mean of the measured values, and n is the sample size. R² is simply an indication of how well your data points fit the regression line. It is used to determine if you can use your equation of the line to make any further predictions about the relationship between your variables. R² values fall between 0 and 1. The closer the R² value is to 1, the more of your data points fall on or very near the regression line.
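A short Python sketch shows how a goodness-of-fit value of this kind can be computed. It assumes the common definition of the coefficient of determination (one minus the ratio of the residual sum of squares to the total sum of squares). The function name `r_squared` and the data are made up for illustration; the line used is the eyeballed estimate y = -0.081x + 6 from the text.

```python
def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y_mean = sum(measured) / len(measured)
    # How far the measured values are from the line's predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    # How far the measured values are from their own mean.
    ss_tot = sum((y - y_mean) ** 2 for y in measured)
    return 1 - ss_res / ss_tot


# Made-up illustration data: ramp angle (degrees) and rolling time (seconds).
angles = [10, 20, 30, 40, 50, 60]
times = [5.4, 4.3, 3.9, 2.8, 2.1, 1.2]

# Predictions from the line estimated in the text: y = -0.081x + 6.
predicted = [-0.081 * x + 6 for x in angles]

r2 = r_squared(times, predicted)
print(f"R^2 = {r2:.3f}")
```

When every measured value sits exactly on the line, the residual sum is zero and the function returns 1.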

3) Generation of a p-value

Using the computer will allow you to calculate a p-value for your relationship. The p-value allows you to decide whether or not to reject your null hypothesis (here, that there is no relationship between the two variables). If your p-value is greater than 0.05, there is NO significant relationship and you would fail to reject (informally, "accept") your null hypothesis. If your p-value is less than 0.05, there IS a significant relationship and you would reject your null hypothesis.
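To give a feel for where a p-value comes from, here is a minimal Python sketch that estimates one with a permutation test. This is a swapped-in technique chosen because it needs only the standard library; regression software typically reports a p-value from a t-test on the slope instead, and the two values will not match exactly. All names and data here are made up for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient (assumes neither list is constant)."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Estimate how often shuffled data looks at least as correlated."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        # Shuffling y breaks any real link between x and y (the null hypothesis).
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            at_least_as_extreme += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_shuffles + 1)


# Made-up illustration data with a strong negative trend.
angles = [10, 20, 30, 40, 50, 60, 70, 80]
times = [5.4, 4.3, 3.9, 2.8, 2.1, 1.2, 0.9, 0.5]

p = permutation_p_value(angles, times)
if p < 0.05:
    print(f"p = {p:.4f}: significant relationship, reject the null hypothesis")
else:
    print(f"p = {p:.4f}: no significant relationship")
```

The final comparison against 0.05 is the same decision rule described above, whichever method produced the p-value.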

What is the relationship between two variables?

Correlation is a statistical technique that is used to measure and describe a relationship between two variables. Usually the two variables are simply observed, not manipulated. Correlation requires two scores from each individual, one for each variable.

How do you show the relationship between two variables?

A scatterplot shows the relationship between two quantitative variables measured for the same individuals. The values of one variable appear on the horizontal axis, and the values of the other variable appear on the vertical axis. Each individual in the data appears as a point on the graph.

Which table shows the relationship between two variables in a tabular form?

Cross tabulation allows us to look at the relationship between two variables by organizing them in a table. This is called bivariate analysis.
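As a small illustration, a cross tabulation can be built in Python by counting how often each pair of categories occurs. The function name `cross_tabulate` and the survey-style data below are made up for this sketch.

```python
from collections import Counter


def cross_tabulate(pairs):
    """Return (row_labels, column_labels, table of counts) for paired categories."""
    counts = Counter(pairs)
    rows = sorted({row for row, _ in counts})
    cols = sorted({col for _, col in counts})
    # A Counter returns 0 for any combination that never occurred.
    table = [[counts[(row, col)] for col in cols] for row in rows]
    return rows, cols, table


# Made-up illustration data: (fertilizer used?, plant height category).
observations = [
    ("fertilizer", "tall"), ("fertilizer", "tall"), ("fertilizer", "short"),
    ("none", "short"), ("none", "short"), ("none", "tall"),
]

rows, cols, table = cross_tabulate(observations)
print("columns:", cols)
for label, counts_in_row in zip(rows, table):
    print(label, counts_in_row)
```

Each cell of the resulting table is the number of individuals falling into that combination of the two variables.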

What is an example of a relationship between two variables?

Graphically, we use scatterplots to display two quantitative variables. Examples of quantitative variables are age, height, and weight (i.e. things that are measured). A relationship can also involve one categorical variable and one quantitative variable, for instance gender and height.