
Correlation

NCERT Class 11 · Economics · Based on the NCERT Class 11 Economics textbook

Chapter Notes

CORRELATION: COMPREHENSIVE CBSE CLASS 11 ECONOMICS NOTES

INTRODUCTION AND MEANING OF CORRELATION

**Correlation** is a statistical technique that measures the relationship between two variables. It shows whether changes in one variable are associated with changes in another, and it indicates the direction and intensity of that relationship.

The term "correlation" comes from the concept of **covariation** — the tendency of two variables to vary together. Key questions that correlation analysis addresses:

  • Is there any relationship between two variables?
  • If one variable changes, does the other also change?
  • Do both variables move in the same direction?
  • How strong is the relationship?

**Important Distinction**: Correlation measures **covariation, NOT causation**. This is the most critical concept in correlation analysis. Just because two variables are correlated does not mean one causes the other. For example, ice-cream sales and drowning deaths both increase with temperature, but ice-cream consumption does not cause drowning. Temperature is the underlying cause affecting both variables.

    TYPES OF RELATIONSHIPS

    Relationships between variables can be classified into three categories:

    Relationships with Clear Cause-and-Effect Interpretation

    These relationships have logical economic or physical explanations. Example: Agricultural productivity and rainfall — low rainfall causes low productivity. The relationship between quantity demanded and price of a commodity (demand curve) has clear theoretical justification.

    Coincidental Relationships

    Some relationships exist but cannot be meaningfully explained causally. Example: The relationship between arrival of migratory birds in a sanctuary and local birth rates shows no causal connection; it is pure coincidence. Similarly, shoe size and money in your pocket may be correlated, but the relationship has no real meaning.

    Spurious Relationships (Caused by Third Variable)

    A third variable's impact on two variables creates a false apparent relationship between them. **Example**: Brisk ice-cream sales correlate positively with deaths due to drowning. However, rising temperature causes BOTH — more ice-cream consumption AND more people swimming (leading to more drowning deaths). Temperature is the true causal variable.

    This distinction is crucial for **policy analysis** — identifying spurious correlations prevents governments from making incorrect policy decisions based on misleading relationships.

    TYPES OF CORRELATION

    Positive Correlation

    **Definition**: Positive correlation occurs when two variables move together in the same direction. When one variable increases, the other also increases; when one decreases, the other also decreases.

    **Examples**:

  • Income and consumption: As income rises, consumption also rises
  • Temperature and ice-cream sales: Higher temperature leads to higher ice-cream sales
  • Education and yield per acre: More educated farmers achieve higher agricultural yields
  • Money supply and price index: Increase in money supply leads to increase in price index

    **Economic Interpretation**: In positive correlation, both variables respond to similar economic conditions or one variable facilitates growth in the other.

    Negative Correlation

    **Definition**: Negative correlation occurs when two variables move in opposite directions. When one variable increases, the other decreases, and vice versa.

    **Examples**:

  • Price and quantity demanded: When apple prices fall, quantity demanded increases; when prices rise, quantity demanded decreases
  • Study hours and failure chances: More study hours lead to lower chances of failure; less study leads to higher failure probability
  • Supply of tomatoes and price: As tomato supply increases in market, price drops (from Rs 40/kg to Rs 4/kg)
  • Interest rates and loan demand: Higher interest rates reduce demand for loans

    **Economic Interpretation**: Negative correlation reflects inverse economic relationships fundamental to demand-supply theory and cost-benefit analysis.

    No Correlation

    **Definition**: No correlation exists when the variables show no consistent pattern: changes in one variable are not associated with predictable changes in the other.

    TECHNIQUES FOR MEASURING CORRELATION

    Three main tools measure correlation: **scatter diagrams** (visual), **Karl Pearson's coefficient** (for numerical data), and **Spearman's rank correlation** (for ranked data).

    SCATTER DIAGRAMS

    A **scatter diagram** is a graphical technique that plots values of two variables as points on graph paper to visually examine the form of relationship without calculating numerical values.

    Construction and Interpretation

    Plot the values of variable X on the horizontal axis and variable Y on the vertical axis. Each pair of observations (X, Y) becomes a point on the graph.

    **Interpretation based on point distribution**:

  • **Upward rising scatter around a line**: Positive correlation — as X increases, Y increases
  • **Downward sloping scatter around a line**: Negative correlation — as X increases, Y decreases
  • **Random scattered points with no pattern**: No correlation — no systematic relationship
  • **All points lie exactly on a straight line sloping upward**: Perfect positive correlation (r = +1)
  • **All points lie exactly on a straight line sloping downward**: Perfect negative correlation (r = –1)

    **Measuring intensity from scatter diagrams**:

  • If points lie **close to a line**: Strong/high correlation
  • If points are **widely dispersed**: Weak/low correlation
  • If points **show clear linear pattern**: Linear relationship exists
  • If points show **curved pattern**: Non-linear relationship (Karl Pearson's formula not suitable)

    Figures 6.1-6.7 Analysis

  • **Fig 6.1 (Positive Correlation)**: Points scatter around upward-sloping line
  • **Fig 6.2 (Negative Correlation)**: Points scatter around downward-sloping line
  • **Fig 6.3 (No Correlation)**: Points randomly scattered with no pattern
  • **Fig 6.4 (Perfect Positive)**: All points on upward-sloping straight line
  • **Fig 6.5 (Perfect Negative)**: All points on downward-sloping straight line
  • **Fig 6.6-6.7 (Non-linear)**: Points follow curved patterns, not linear

    **Advantage**: Quick visual understanding without complex calculations.

    **Limitation**: Does not provide precise numerical measure; subjective interpretation possible.

    KARL PEARSON'S COEFFICIENT OF CORRELATION

    **Also known as**: Product-moment correlation coefficient or simple correlation coefficient.

    **Definition**: Karl Pearson's coefficient provides a precise numerical measure of the degree of linear relationship between two variables X and Y.

    Fundamental Concept

    When two variables have a **linear relationship**, their association can be represented by a straight line on a graph. Karl Pearson's coefficient measures both direction (positive/negative) and strength (magnitude) of this linear association.

    Prerequisites for Using Karl Pearson's Coefficient

  • The relationship between variables must be **linear** (representable by straight line)
  • If the true relationship is **non-linear** (curved), using Karl Pearson's coefficient is **misleading and should be avoided**
  • **Best practice**: Always examine scatter diagram first before calculating coefficient

    Basic Definitions and Formulas

    **Arithmetic Mean**:

  • X̄ = ΣX / N (mean of variable X)
  • Ȳ = ΣY / N (mean of variable Y)

    **Variance** (measure of spread):

  • σ²ₓ = Σ(X - X̄)² / N = ΣX² / N - (X̄)²
  • σ²ᵧ = Σ(Y - Ȳ)² / N = ΣY² / N - (Ȳ)²

    **Standard Deviation** (square root of variance):

  • σₓ = √[Σ(X - X̄)² / N]
  • σᵧ = √[Σ(Y - Ȳ)² / N]

    **Covariance** (measure of joint movement):

  • Cov(X,Y) = Σ(X - X̄)(Y - Ȳ) / N = Σ(xy) / N

    where x = X - X̄ and y = Y - Ȳ are deviations from the respective means.
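
The definitions above translate directly into a few lines of code. The following is a minimal Python sketch, assuming the population formulas used in these notes (division by N). The function names and the five data points are illustrative and do not come from the textbook.

```python
from math import sqrt

def mean(values):
    """Arithmetic mean: sum of the values divided by N."""
    return sum(values) / len(values)

def variance(values):
    """Population variance: mean of squared deviations (divides by N)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def std_dev(values):
    """Standard deviation: square root of the variance."""
    return sqrt(variance(values))

def covariance(xs, ys):
    """Population covariance: mean of the products of paired deviations."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data, chosen only to keep the arithmetic easy to follow.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

print(mean(X), mean(Y))          # means of X and Y
print(variance(X), std_dev(X))   # spread of X
print(covariance(X, Y))          # positive: X and Y tend to move together

# The shortcut form of the variance, sum(X^2)/N - mean^2, gives the same number.
shortcut = sum(x * x for x in X) / len(X) - mean(X) ** 2
print(shortcut)
```

A positive covariance only says that the variables tend to move together; dividing by both standard deviations is what turns it into the unit-free coefficient r.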

    Formulas for Correlation Coefficient

    **Formula 1** (using covariance and standard deviations):

    r = [Σ(xy) / N] / (σₓ × σᵧ)

    **Formula 2** (deviation form):

    r = Σ(X - X̄)(Y - Ȳ) / √[Σ(X - X̄)² × Σ(Y - Ȳ)²]

    **Formula 3** (raw score form):

    r = [ΣXY - (ΣX × ΣY) / N] / √[{ΣX² - (ΣX)² / N} × {ΣY² - (ΣY)² / N}]

    **Formula 4** (alternative raw score form):

    r = [N × ΣXY - (ΣX × ΣY)] / √[{N × ΣX² - (ΣX)²} × {N × ΣY² - (ΣY)²}]

    All four formulas yield identical results; choice depends on data form and computational convenience.
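
One way to convince yourself that the four forms agree is to code each one and run them on the same data. The sketch below does this in Python; the function names and the small data set are hypothetical, picked so that r comes out to a round number.

```python
from math import sqrt

def r_formula1(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sd_x = sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sd_y = sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

def r_formula2(xs, ys):
    """Deviation form."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den

def r_formula3(xs, ys):
    """Raw score form."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx, syy = sum(x * x for x in xs), sum(y * y for y in ys)
    return (sxy - sx * sy / n) / sqrt((sxx - sx ** 2 / n) * (syy - sy ** 2 / n))

def r_formula4(xs, ys):
    """Alternative raw score form (everything multiplied through by N)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx, syy = sum(x * x for x in xs), sum(y * y for y in ys)
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

# Hypothetical data: sum of xy deviations = 16, sums of squares = 40 and 10.
X = [2, 4, 6, 8, 10]
Y = [1, 3, 2, 5, 4]

results = [f(X, Y) for f in (r_formula1, r_formula2, r_formula3, r_formula4)]
print([round(r, 4) for r in results])  # every entry rounds to 0.8
```

Formula 4 avoids computing means and deviations, which is why it is usually the most convenient for hand calculation from a table of X, Y, XY, X² and Y² columns.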

    Properties of Correlation Coefficient

    1. **Unitless Measure**: r has no units of measurement. The correlation between height in feet and weight in kilograms is a pure number (e.g., 0.7), independent of measurement units.

    2. **Range**: The value of r always lies between –1 and +1, inclusive:

  • –1 ≤ r ≤ +1
  • Any value outside this range indicates calculation error

    3. **Sign Indicates Direction**:

  • Positive r: Variables move in same direction
  • Negative r: Variables move in opposite directions
  • r = 0: No linear relationship

    4. **Magnitude Indicates Strength**:

  • |r| close to 1: Strong linear relationship
  • |r| close to 0: Weak linear relationship
  • r = 0: No linear relationship (but non-linear relationship may exist)

    5. **Perfect Correlation**:

  • r = +1: Perfect positive correlation (all points on upward-sloping line)
  • r = –1: Perfect negative correlation (all points on downward-sloping line)

    6. **Independence from Origin and Scale Change** (Most Important Property):

    If U = (X – A) / B and V = (Y – C) / D, where A and C are any constants (typically assumed means) and B and D are non-zero constants of the same sign, then:

    **rᵤᵥ = rₓᵧ**

    This property is fundamental to the **step deviation method**, allowing simplified calculations when data values are large.
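
The property can be checked numerically. The Python sketch below uses a hypothetical data set and hypothetical constants A, B, C and D; it also shows what goes wrong when B and D have opposite signs, which is why the same-sign condition is stated.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's r in deviation form."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data with large X values.
X = [1050, 1100, 1150, 1200, 1250]
Y = [30, 20, 25, 10, 15]

A, B = 1150, 50   # change of origin and scale for X
C, D = 20, 5      # change of origin and scale for Y (same sign as B)

U = [(x - A) / B for x in X]   # -2, -1, 0, 1, 2
V = [(y - C) / D for y in Y]   #  2,  0, 1, -2, -1

r_xy = pearson_r(X, Y)
r_uv = pearson_r(U, V)
print(round(r_xy, 4), round(r_uv, 4))   # the two values match

# If the scale factors have opposite signs, the magnitude is kept
# but the sign of r flips.
V_flipped = [(y - C) / -D for y in Y]
r_flipped = pearson_r(U, V_flipped)
print(round(r_flipped, 4))
```

The transformed values are small integers, so the sums needed for r can be worked out by hand far more easily than with the original figures.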

    Interpretation Guidelines

  • **r = 0.8 to 1.0 (or –0.8 to –1.0)**: Very strong correlation
  • **r = 0.6 to 0.8 (or –0.6 to –0.8)**: Strong correlation
  • **r = 0.4 to 0.6 (or –0.4 to –0.6)**: Moderate correlation
  • **r = 0.2 to 0.4 (or –0.2 to –0.4)**: Weak correlation
  • **r = 0 to 0.2 (or 0 to –0.2)**: Very weak or no correlation
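
These bands can be written as a small lookup. The Python sketch below is illustrative only: the function name `describe_r` is hypothetical, and because the bands above share their end points (0.8 appears in two of them), it assumes a boundary value belongs to the stronger band.

```python
def describe_r(r):
    """Return (direction, strength) for a correlation coefficient r.

    Assumption: a value exactly on a boundary (0.2, 0.4, 0.6, 0.8)
    is placed in the stronger band.
    """
    if not -1 <= r <= 1:
        # A value outside [-1, +1] can only come from a calculation error.
        raise ValueError("r must lie between -1 and +1")

    size = abs(r)
    if size >= 0.8:
        strength = "very strong"
    elif size >= 0.6:
        strength = "strong"
    elif size >= 0.4:
        strength = "moderate"
    elif size >= 0.2:
        strength = "weak"
    else:
        strength = "very weak or none"

    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return direction, strength

print(describe_r(0.644))   # ('positive', 'strong')
print(describe_r(-0.15))   # ('negative', 'very weak or none')
```

The cut-offs are a rule of thumb for describing results in words; they are not a statistical test.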

    Worked Example 1: Years of Schooling and Agricultural Yield

    **Data Table 6.1**:

    | Years of Education (X) | Annual Yield per Acre in Rs '000 (Y) |
    |---|---|
    | 0 | 4 |
    | 2 | 4 |
    | 4 | 6 |
    | 6 | 10 |
    | 8 | 10 |
    | 10 | 8 |
    | 12 | 7 |

    **Calculations**:

  • N = 7
  • ΣX = 42, ΣY = 49
  • X̄ = 42/7 = 6, Ȳ = 49/7 = 7
  • Σ(X - X̄)² = 112, Σ(Y - Ȳ)² = 38
  • Σ(X - X̄)(Y - Ȳ) = 42

    **Using Formula 2**:

    r = 42 / √(112 × 38) = 42 / √4256 = 42 / 65.24 = **0.644**

    **Interpretation**: The positive coefficient (r = 0.644) indicates that more years of farmer education are associated with higher annual yield per acre, and the association is fairly strong though far from perfect. This is consistent with the view that education improves agricultural productivity, but the coefficient alone does not prove causation. Read together with economic reasoning, it lends support to farmer education programmes.
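
The arithmetic of this example can be reproduced step by step. The Python sketch below uses the data of Table 6.1 and the deviation form (Formula 2); the variable names are illustrative.

```python
from math import sqrt

# Data from Table 6.1.
years = [0, 2, 4, 6, 8, 10, 12]    # X: years of education
yields = [4, 4, 6, 10, 10, 8, 7]   # Y: annual yield in Rs '000

n = len(years)
x_bar = sum(years) / n             # 6.0
y_bar = sum(yields) / n            # 7.0

# Deviations from the means.
dx = [x - x_bar for x in years]
dy = [y - y_bar for y in yields]

sum_dx2 = sum(d * d for d in dx)                # 112.0
sum_dy2 = sum(d * d for d in dy)                # 38.0
sum_dxdy = sum(a * b for a, b in zip(dx, dy))   # 42.0

r = sum_dxdy / sqrt(sum_dx2 * sum_dy2)
print(round(r, 3))   # 0.644
```

Printing `dx`, `dy` and their products row by row gives the same columns you would set out in a hand-worked table.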

    Critical Example: Correlation vs. Causation (Epidemic and Doctors)

    When an epidemic spreads across villages, the data may show a positive correlation between the number of deaths and the number of doctors sent, which seems counterintuitive. This does NOT mean that doctors cause deaths. Possible reasons:

  • Government sends more doctors to severely affected villages (more deaths)
  • Many deaths are terminal cases where medical help arrives too late
  • Benefits of medical care take time to manifest
  • Data from specific time period doesn't show long-term health improvements

    **Lesson**: Understanding data context is essential before interpreting correlation. Statistical methods are no substitute for logical reasoning.

    STEP DEVIATION METHOD FOR CORRELATION COEFFICIENT

    When values of X and Y are large, computational burden increases significantly. The **step deviation method** uses the property that correlation is unaffected by change of origin and scale.

    Method

    Transform variables using:

  • U = (X – A) / h
  • V = (Y – C) / k

    where:

  • A = assumed mean of X (any convenient value, typically close to the middle value)
  • C = assumed mean of Y
  • h = common factor for X (typically the class interval width, if applicable)
  • k = common factor for Y
  • h and k must have the same sign (both positive)

    Then: **rᵤᵥ = rₓᵧ** (correlation of transformed variables equals original correlation)

    Worked Example 2: Price Index and Money Supply

    **Original Data**:

    | Price Index (X) | Money Supply in Rs Crores (Y) |
    |---|---|
    | 120 | 1800 |
    | 150 | 2000 |
    | 190 | 2500 |
    | 220 | 2700 |
    | 230 | 3000 |

    **Step 1**: Choose A = 100, h = 10, C = 1700, k = 100

    **Step 2**: Calculate transformed values U and V:

    | X | U = (X–100)/10 | Y | V = (Y–1700)/100 |
    |---|---|---|---|
    | 120 | 2 | 1800 | 1 |
    | 150 | 5 | 2000 | 3 |
    | 190 | 9 | 2500 | 8 |
    | 220 | 12 | 2700 | 10 |
    | 230 | 13 | 3000 | 13 |

    **Step 3**: Calculate required values:

  • ΣU = 41, ΣV = 35
  • ΣU² = 423, ΣV² = 343
  • ΣUV = 378, N = 5

    **Step 4**: Apply formula:

    r = [N·ΣUV – (ΣU·ΣV)] / √[{N·ΣU² – (ΣU)²} × {N·ΣV² – (ΣV)²}]

    r = [5(378) – (41×35)] / √[{5(423) – 41²} × {5(343) – 35²}]

    r = [1890 – 1435] / √[{2115 – 1681} × {1715 – 1225}]

    r = 455 / √[434 × 490] = 455 / √212,660 = 455 / 461.15 ≈ **0.987**

    **Interpretation**: There is a very strong positive correlation (r ≈ 0.987) between the price index and money supply in this data set. This is consistent with the premise behind much of **monetary policy**, namely that the price level tends to rise when money supply expands, which is why the relationship matters for inflation management and central bank operations. The coefficient alone, however, shows neither that the increase is proportional nor which variable drives the other.
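
The four steps of this example can be replayed in code. The Python sketch below uses the data and the constants A = 100, h = 10, C = 1700, k = 100 from the example and applies Formula 4; the function and variable names are illustrative. It also computes r from the untransformed data as a cross-check.

```python
from math import sqrt

def pearson_r_raw(xs, ys):
    """Karl Pearson's r from raw sums (Formula 4)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

price_index = [120, 150, 190, 220, 230]        # X
money_supply = [1800, 2000, 2500, 2700, 3000]  # Y, in Rs crores

A, h = 100, 10     # assumed mean and common factor for X
C, k = 1700, 100   # assumed mean and common factor for Y

U = [(x - A) / h for x in price_index]    # 2, 5, 9, 12, 13
V = [(y - C) / k for y in money_supply]   # 1, 3, 8, 10, 13

print(sum(U), sum(V))                                            # 41.0 35.0
print(sum(u * u for u in U), sum(v * v for v in V))              # 423.0 343.0
print(sum(u * v for u, v in zip(U, V)))                          # 378.0

r_uv = pearson_r_raw(U, V)
r_xy = pearson_r_raw(price_index, money_supply)
print(round(r_uv, 3), round(r_xy, 3))   # both 0.987
```

The cross-check on the original figures involves products in the millions, which is exactly the burden the step deviation method is meant to remove in hand calculation.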

    Advantages of Step Deviation Method

  • Eliminates large numbers, reducing calculation burden
  • Reduces arithmetic errors in multiplication
  • Results are identical to direct calculation
  • Essential when dealing with data in thousands/millions

    SPEARMAN'S RANK CORRELATION COEFFICIENT

    **Developed by**: British psychologist C.E. Spearman.

    **Definition**: Spearman's rank correlation measures the linear association between **ranks** assigned to items according to their attributes, rather than their actual numerical values.

    When to Use Spearman's Rank Correlation

    1. **Measurable attributes where measurement is difficult**: The attribute could be measured, but measuring devices are unavailable. Example: ranking students by height and weight in a remote village that has no measuring rods or weighing scales.

    2. **Non-measurable qualitative attributes**: Variables that cannot be numerically measured directly:

  • Intelligence, fairness, honesty, beauty, leadership qualities
  • These can only be ranked relatively (e.g., "person A is more intelligent than person B")

    3. **Non-linear relationships**: When the scatter diagram shows a curved (non-linear) relationship (Figures 6.6-6.7), Spearman's coefficient is more appropriate than Karl Pearson's, provided the relationship is monotonic (consistently rising or consistently falling).

    4. **Extreme values present**: Spearman's coefficient is **robust against extreme values**. If data contains outliers, Spearman's is superior to Karl Pearson's because it uses ranks rather than actual values.

    Formula for Spearman's Rank Correlation

    rₐ = 1 – [6ΣD² / (n(n² – 1))]

    where:

  • rₐ = Spearman's rank correlation coefficient
  • D = difference in ranks assigned to paired items
  • n = number of observations

    The formula is derived from Karl Pearson's coefficient by replacing values with ranks.
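
As a quick illustration of the formula, the Python sketch below computes the coefficient from two sets of ranks. The scenario (two judges ranking five contestants) and the function name are hypothetical, and the sketch assumes there are no tied ranks, since the formula in this form does not handle ties.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman's rank correlation: 1 - 6*sum(D^2) / (n*(n^2 - 1)).

    Assumes both lists hold the ranks 1..n with no ties.
    """
    if len(rank_x) != len(rank_y):
        raise ValueError("both rank lists must have the same length")
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical ranks given by two judges to the same five contestants.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

# D = -1, 1, -1, 1, 0, so sum(D^2) = 4 and the coefficient is 1 - 24/120.
print(spearman_from_ranks(judge_a, judge_b))

# Identical rankings give +1; exactly reversed rankings give -1.
print(spearman_from_ranks(judge_a, judge_a))
print(spearman_from_ranks(judge_a, judge_a[::-1]))
```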

    Properties of Spearman's Rank Correlation

    1. **Same interpretation as Karl Pearson's coefficient**:

  • rₐ ranges from –1 to +1
  • rₐ = +1: Perfect positive rank correlation
  • rₐ = –1: Perfect negative rank correlation
  • rₐ = 0: No rank correlation

    2. **Direction and Strength**: Magnitude and sign have identical interpretation as in Pearson's coefficient.

    3. **Robustness**: Not affected by extreme values because it uses ranks (order information) rather than actual values.

    4. **Applicability**: All interpretation guidelines for correlation strength apply to Spearman's coefficient.
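
The robustness property is easy to see on a small example. In the Python sketch below, one value of Y is replaced by an extreme value; the data and helper names are hypothetical, and the ranking helper assumes no tied values.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's r in deviation form."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank 1 for the smallest value. Assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_r(xs, ys):
    """Spearman's coefficient computed from the ranks of the raw values."""
    n = len(xs)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

X = [1, 2, 3, 4, 5, 6]
Y_clean = [2, 4, 5, 7, 8, 10]
Y_outlier = [2, 4, 5, 7, 8, 100]   # last value replaced by an extreme one

# Pearson's r drops noticeably (from about 0.996 to about 0.695) ...
print(round(pearson_r(X, Y_clean), 3), round(pearson_r(X, Y_outlier), 3))

# ... while Spearman's coefficient is unchanged, because the extreme
# value still has the highest rank.
print(spearman_r(X, Y_clean), spearman_r(X, Y_outlier))
```

The ranks record only the order of the observations, so making the largest value even larger changes nothing as long as the order stays the same.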

    Advantages Over Karl Pearson's Coefficient

  • Can be used for qualitative variables that cannot be measured numerically
  • Not affected by extreme values or outliers
  • Can work with non-linear but monotonic relationships
  • Simpler conceptual understanding for some applications

    Advantages of Karl Pearson's Over Spearman's

  • Uses actual values, providing more precise information
  • More statistical power when the relationship is truly linear and the data are approximately normally distributed
  • Conveys magnitude of relationship more accurately

    CORRELATION IN INDIAN ECONOMIC CONTEXT

    Real-World Applications in India

    1. **Agricultural Economics**:

  • Relationship between farmer education years and crop yield (Example 1)
  • Correlation between rainfall patterns and agricultural productivity
  • Market data: Vegetable supply in local mandi and price movements
  • Policy relevance: Investment in farmer education to increase productivity

    2. **Monetary Policy**:

  • Strong positive correlation between price index and money supply (Example 2, r ≈ 0.987)
  • RBI uses this relationship to manage inflation through money supply control
  • Understanding this correlation helps predict price movements

    3. **Economic Development Indicators**:

  • National income growth and gross domestic savings (Activity Table 6.2)
  • Table shows varied correlations across different years:
  • Economic Survey data reveals relationship patterns for policy formulation

    4. **Sectoral Relationships**:

  • Income and consumption patterns across rural-urban divide
  • Interest rates and agricultural loan demand
  • Infrastructure development and industrial growth rates

    Policy Implications

    Understanding correlation helps policymakers:

  • Identify spurious relationships that mislead policy
  • Design evidence-based interventions
  • Predict economic variables for planning
  • Monitor effectiveness of economic programs

    Example: If farmer education shows a strong positive correlation with yield, and a causal link is plausible, the case for investment in agricultural extension services is strengthened.

    COMMON ERRORS AND MISCONCEPTIONS

    Error 1: Interpreting Correlation as Causation

    **Mistake**: Finding r = 0.8 between variables X and Y and concluding "X causes Y."

    **Correction**: Correlation measures association, not causation. Third variables may cause both. Always examine logical relationship and data context.

    Error 2: Expecting Linear Relationships

    **Mistake**: Calculating Karl Pearson's r for non-linear data.

    **Correction**: First examine the scatter diagram. For a curved relationship that is still monotonic, use Spearman's rank correlation; otherwise, acknowledge that the relationship is non-linear.

    Error 3: Ignoring Extreme Values

    **Mistake**: Using Karl Pearson's coefficient when data contains outliers.

    **Correction**: Use Spearman's rank correlation or examine if extreme values are data entry errors.

    Error 4: Correlation Value Outside –1 to +1 Range

    **Mistake**: Calculating r = 1.5 or r = –1.2.

    **Correction**: Recheck calculation — error exists. Recompute using correct formula.

    Error 5: Misinterpreting Weak Correlation

    **Mistake**: Assuming r = 0.15 means no relationship exists.

    **Correction**: r = 0.15 shows only that the linear relationship is weak. A strong non-linear relationship may still exist, so check the scatter diagram.
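
A small constructed case makes the point concrete. In the Python sketch below, Y is determined exactly by X through Y = X², yet Karl Pearson's r is zero because the relationship is not linear. The data are hypothetical and deliberately symmetric around zero.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's r in deviation form."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

X = [-3, -2, -1, 0, 1, 2, 3]
Y = [x ** 2 for x in X]   # 9, 4, 1, 0, 1, 4, 9: a perfect U-shaped curve

r = pearson_r(X, Y)
print(r)   # zero: no linear relationship, although Y depends entirely on X
```

The positive products of deviations on the right half of the curve are cancelled by the negative products on the left half, so the covariance, and with it r, is zero.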

    SUMMARY OF KEY FORMULAS

    **Mean**: X̄ = ΣX / N

    **Variance**: σ² = Σ(X – X̄)² / N = ΣX² / N – (X̄)²

    **Covariance**: Cov(X,Y) = Σ(X – X̄)(Y – Ȳ) / N

    **Karl Pearson's Coefficient (Formula 1)**:

    r = [Σ(xy) / N] / (σₓ × σᵧ)

    **Karl Pearson's Coefficient (Formula 2)**:

    r = Σ(X – X̄)(Y – Ȳ) / √[Σ(X – X̄)² × Σ(Y – Ȳ)²]

    **Karl Pearson's Coefficient (Formula 4 – Most Useful)**:

    r = [N·ΣXY – (ΣX·ΣY)] / √[{N·ΣX² – (ΣX)²} × {N·ΣY² – (ΣY)²}]

    **Spearman's Rank Correlation**:

    rₐ = 1 – [6ΣD² / (n(n² – 1))]

    EXAMINATION TIPS

    1. **Always draw scatter diagram first** before calculating coefficient.

    2. **For large data values**, use step deviation method to reduce computational burden.

    3. **When reporting r**, mention both direction (positive/negative) and strength (weak/strong/very strong).

    4. **For qualitative variables** (beauty, honesty, intelligence), use only Spearman's rank correlation.

    5. **Check answer reasonableness**: Is –1 ≤ r ≤ +1? Does sign match scatter diagram direction?

    6. **Distinguish correlation from causation** in answer explanations.

    7. **Indian economic examples strengthen answers**: Reference farmer education-yield relationship, agricultural markets, monetary policy applications.

    8. **Show all calculation steps** in detail to earn partial credit if final answer has minor errors.

    MCQs — 10 Questions with Answers

    Q1. Which of the following is the PRIMARY advantage of using a scatter diagram to study correlation?

    • A. It provides an exact numerical value of the correlation coefficient
    • B. It visually displays the direction and intensity of relationship without calculation ✓
    • C. It can only be used when there is perfect correlation
    • D. It eliminates the need for Karl Pearson's coefficient entirely

    Answer: B — A scatter diagram's key advantage is visual inspection of the pattern and closeness of points, showing direction and strength without numerical computation.

    Q2. Which statement correctly distinguishes between positive and negative correlation?

    • A. Positive correlation occurs when both variables increase; negative correlation occurs when both decrease
    • B. Positive correlation means variables move in the same direction; negative correlation means they move in opposite directions ✓
    • C. Positive correlation is stronger than negative correlation
    • D. Negative correlation implies causation while positive correlation does not

    Answer: B — Positive correlation means when X increases Y increases (or both decrease); negative correlation means when X increases Y decreases and vice versa.

    Q3. A scatter diagram shows points scattered randomly with no clear pattern. This indicates:

    • A. Perfect positive correlation
    • B. Perfect negative correlation
    • C. No correlation or very weak correlation ✓
    • D. Strong negative correlation

    Answer: C — Random scatter with no upward or downward trend indicates the variables have no consistent linear relationship or extremely weak correlation.

    Q4. In a mandi, as the supply of tomatoes increases dramatically during harvest season, the price drops from Rs 40/kg to Rs 4/kg. What type of correlation exists between supply and price?

    • A. Positive correlation
    • B. Negative correlation ✓
    • C. Zero correlation
    • D. Perfect positive correlation

    Answer: B — As supply increases (↑), price decreases (↓), showing variables move in opposite directions—this is negative correlation.

    Q5. Which of the following is NOT a valid reason to reject the claim that correlation implies causation?

    • A. A third variable (like temperature) might cause both ice-cream sales and drowning deaths to increase
    • B. The correlation might be pure coincidence (e.g., migratory birds and birth rates)
    • C. The direction of causation could be reversed
    • D. One variable always has a higher numerical value than the other ✓

    Answer: D — The magnitude of variable values has no bearing on whether correlation implies causation; lurking variables, coincidence, and reversed causation are valid reasons to reject causal claims.

    Q6. Karl Pearson's coefficient of correlation should be applied ONLY when:

    • A. The relationship between variables is non-linear
    • B. The relationship between variables is linear ✓
    • C. Both variables are qualitative (non-numerical)
    • D. The sample size is very large (n > 100)

    Answer: B — Karl Pearson's correlation coefficient measures linear relationships; applying it to non-linear data can give misleading results.

    Q7. Which correlation measurement technique is best suited for analyzing the relationship between students' physical appearance and their academic performance?

    • A. Karl Pearson's coefficient of correlation
    • B. Scatter diagram with raw scores
    • C. Spearman's rank correlation ✓
    • D. Simple numerical mean of both variables

    Answer: C — Physical appearance is a non-numerical attribute that must be ranked; Spearman's rank correlation is designed for ranking such qualitative variables.

    Q8. Which of the following statements about perfect correlation is CORRECT? (A) Perfect positive correlation means r = +1 and all points lie on an upward-sloping line. (B) Perfect negative correlation means r = −1 and all points lie on a downward-sloping line. Choose the correct option:

    • A. Only (A) is correct
    • B. Only (B) is correct
    • C. Both (A) and (B) are correct ✓
    • D. Neither (A) nor (B) is correct

    Answer: C — Both statements are accurate: r = +1 indicates perfect positive correlation (upward line) and r = −1 indicates perfect negative correlation (downward line).

    Q9. Study the following scenario: A researcher finds a strong positive correlation (r = 0.92) between the number of firefighters at a fire scene and the total fire damage caused. Which is the MOST appropriate conclusion?

    • A. More firefighters cause greater fire damage
    • B. More fire damage causes more firefighters to be called to the scene
    • C. A third variable (size/severity of fire) causes both more firefighters and more damage ✓
    • D. There is no relationship between firefighters and damage; the correlation is meaningless

    Answer: C — This is a classic lurking variable example: larger fires cause both more damage and attract more firefighters; correlation does not imply the firefighters cause damage.

    Q10. In India's agricultural sector, a government economist observes that agricultural productivity increases as rainfall increases (positive correlation, r = +0.78). However, a second economist argues that the correlation may be misleading because rainfall might not be the true cause. Which factor could the second economist be hinting at?

    • A. The sample size is too small
    • B. Soil fertility, irrigation availability, or seed quality might be the actual drivers of productivity independent of rainfall ✓
    • C. Correlation values above 0.7 are always caused by one variable acting on the other
    • D. Productivity always increases linearly with rainfall in all regions

    Answer: B — The economist is identifying lurking variables (soil fertility, irrigation, seeds) that could influence productivity independent of rainfall, showing why high correlation doesn't prove causation.

    Flashcards

    What does correlation measure?

    Correlation measures the direction and intensity (strength) of the linear relationship between two variables, not causation.

    Define positive correlation with an example.

    When both variables move in the same direction (both increase or both decrease), like income and consumption rising together.

    Define negative correlation with an example.

    When variables move in opposite directions—as one increases, the other decreases, like price of apples and quantity demanded.

    Why is the ice-cream and drowning death relationship NOT causal?

    Both increase because of a third variable (rising temperature), not because ice-cream causes drowning; such a hidden third variable is called a lurking variable.

    What does a scatter diagram show about correlation?

    It visually displays the direction and strength of a relationship: points on a line indicate strong/perfect correlation, scattered points indicate weak or no correlation.

    What is perfect positive correlation?

    All data points lie exactly on an upward-sloping straight line, showing a complete one-to-one positive relationship.

    What is perfect negative correlation?

    All data points lie exactly on a downward-sloping straight line, showing a complete one-to-one inverse relationship.

    When should Karl Pearson's correlation coefficient be used?

    Only when the relationship between two variables is linear (can be represented by a straight line).

    What type of variables does Spearman's rank correlation measure?

    It measures correlation between ranks of non-numerical attributes like intelligence, honesty, or physical appearance.

    Why is correlation different from causation?

    Correlation shows that two variables move together, but it does not prove one causes the other; there may be a third variable or pure coincidence.

    Important Board Questions

    Define correlation and explain why correlation cannot be interpreted as causation with one example from the study material. [2 marks]

    State the definition of correlation (covariation, not causation). Use the ice-cream/drowning example or migratory birds/birth rates to show how a third variable or coincidence breaks the causal chain.

    Explain the difference between positive and negative correlation. Draw or describe what a scatter diagram would look like for each type. Provide one economic example of each type of correlation. [5 marks]

    Positive: both variables move in same direction (describe upward-sloping pattern on scatter). Negative: opposite directions (downward slope). Example for positive—income and consumption. Example for negative—price and demand. Show visual pattern clearly.

    The following scatter plot shows the relationship between study hours (X-axis) and examination scores (Y-axis) for Class 11 students. The points form a clear upward-sloping line with one outlier. (a) Identify the type of correlation shown. (b) Explain why this correlation might be misleading or incomplete as evidence that more study hours CAUSE higher scores. (c) What third variable might actually be driving both study hours and examination scores? [6 marks]

    Part (a): Positive correlation. Part (b): Correlation ≠ causation; lurking variables may exist. Part (c): Intelligence, aptitude, or prior subject knowledge could drive both effort and performance independently. Discuss the concept of covariation versus causation and why scatter diagrams alone cannot prove cause-effect.
