Exploring Relationships Between Variables: From ANOVA to Correlation
Statistical analysis provides various tools to examine relationships between different types of variables. Each tool is tailored to specific data types and relationships. Previously, we explored Analysis of Variance (ANOVA), which examines the relationship between a categorical explanatory variable and a quantitative response variable, and the Chi-Square Test of Independence, which assesses relationships between two categorical variables. Now, we shift focus to analyzing relationships between two quantitative variables using the Pearson Correlation.
Scatterplots: Visualizing Quantitative Relationships
Before computing a correlation, it helps to draw a scatterplot, which offers an intuitive way to visualize the relationship between two quantitative variables. In a scatterplot:
- The explanatory variable (X) is plotted on the horizontal axis.
- The response variable (Y) is plotted on the vertical axis.
Each individual in the dataset is represented as a single point, determined by their x-value (explanatory variable) and y-value (response variable). Scatterplots help reveal the overall pattern of the relationship, which can be described in terms of direction, form, and strength.

Describing Scatterplots: Direction, Form, and Strength
Direction
The direction of a relationship indicates how changes in one variable correspond to changes in the other:
- Positive Direction: An increase in one variable is associated with an increase in the other.
- Negative Direction: An increase in one variable is associated with a decrease in the other.
- No Direction: Neither an upward nor a downward trend is apparent as one variable increases.
Form
The form describes the general shape of the scatterplot:
- Linear: Points roughly follow a straight line.
- Curvilinear: Points cluster around a curved line.
Other forms may exist, but the Pearson Correlation applies only to linear relationships.
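To illustrate why form matters, here is a minimal sketch using hypothetical data. It assumes the standard formula for the Pearson coefficient (the sum of products of paired deviations from the means, divided by the square root of the product of the two sums of squared deviations). The helper name `pearson_r` and the data are illustrative and not part of the original reading.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical X values, symmetric around zero.
xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]   # points fall exactly on a straight line
curved = [x ** 2 for x in xs]      # points fall exactly on a curve (parabola)

r_linear = pearson_r(xs, linear)
r_curved = pearson_r(xs, curved)
print(f"linear form:      r = {r_linear:.2f}")
print(f"curvilinear form: r = {r_curved:.2f}")
```

In this sketch the curved data are perfectly predictable from X, yet the coefficient comes out at essentially zero because the upward and downward halves of the curve cancel. That is why it is worth checking the form on a scatterplot before relying on a correlation.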
Strength
Strength refers to how closely the data points follow the identified form:
- Strong Relationship: Points are tightly clustered along the line.
- Weak Relationship: Points are more scattered and deviate significantly from the line.
While visual inspection provides an initial sense of strength, it is subjective and prone to error. A numerical measure is required for precise evaluation.
Pearson Correlation: Measuring Linear Relationships
The Pearson Correlation Coefficient (r) quantifies the strength and direction of a linear relationship between two quantitative variables. Key properties of r include:
- Range: The value of r lies between −1 and +1.
- Positive r: Indicates a positive relationship (as X increases, Y increases).
- Negative r: Indicates a negative relationship (as X increases, Y decreases).
- Magnitude:
  - Values near 0: Indicate a weak linear relationship.
  - Values near −1 or +1: Indicate a strong linear relationship.
For example:
- An r value of +0.8 suggests a strong positive linear relationship.
- An r value of −0.3 suggests a weak negative linear relationship.
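The properties above can be checked with a short computation. The following is a sketch rather than a definitive implementation: it assumes the standard definition of r (the sum of products of paired deviations from the means, divided by the square root of the product of the two sums of squared deviations). The function name `pearson_r` and both datasets are hypothetical, chosen only to show one positive and one negative relationship.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient for paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of products of paired deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations for each variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)

# Hypothetical data: hours studied (X) for six students.
hours = [1, 2, 3, 4, 5, 6]
# Hypothetical responses (Y): exam scores rise with hours, errors fall.
scores = [52, 55, 61, 70, 68, 80]
errors = [9, 8, 8, 5, 4, 2]

r_pos = pearson_r(hours, scores)
r_neg = pearson_r(hours, errors)
print(f"hours vs. scores: r = {r_pos:+.2f}")  # sign is positive
print(f"hours vs. errors: r = {r_neg:+.2f}")  # sign is negative
```

The sign of each result gives the direction and the absolute value gives the strength. In practice a library routine would normally be used instead of a hand-written function. The explicit version is shown here only to make the calculation visible.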
Conclusion: From Visual Patterns to Numerical Precision
Scatterplots provide a starting point for examining relationships between quantitative variables, offering visual insight into direction, form, and strength. However, the Pearson Correlation Coefficient offers a precise, numerical measure of the strength and direction of linear relationships. This combination of graphical and numerical tools equips researchers with a comprehensive approach to analyzing quantitative data relationships.