Measures of Central Tendency

NCERT Class 11 · Economics · Based on the NCERT Class 11 Economics textbook

Chapter Notes

MEASURES OF CENTRAL TENDENCY

Introduction and Need for Central Tendency

**Central tendency** is a numerical method to summarise a large set of data by representing it with a single typical or representative value. The purpose is to condense vast information into one meaningful number that captures the essential characteristic of the data.

**Real-life examples where central tendency is used:**

  • Average marks obtained by students in a class test
  • Average rainfall in a region
  • Average production in a factory
  • Average income of workers in a firm
  • Average land holding size among farmers

    **Why is summarisation needed?**

    When dealing with large datasets, it becomes impossible to draw meaningful conclusions by examining each individual observation. A single value that represents the entire dataset helps in:

  • Quick comparison between different groups
  • Identifying the typical or representative value
  • Making policy decisions based on aggregated information
  • Understanding the economic condition of a population

    **Case Study: Baiju's Farm**

    Baiju is a farmer in Balapur village, Buxar district, Bihar. To evaluate Baiju's economic condition relative to other small farmers, we would need to:

    1. Compare his land holding with the **arithmetic mean** (average size) of all farmers' holdings

    2. Check if his holding is above the size that half the farmers own (using **median**)

    3. Determine if his holding is above what most farmers own (using **mode**)

    This example illustrates why we need different types of averages for different analytical purposes.

    ---

    ARITHMETIC MEAN (A.M.)

    **Definition:** The arithmetic mean is the sum of all values of observations divided by the total number of observations. It is the most commonly used measure of central tendency and is denoted by **X̄** (X-bar).

    **Formula for Ungrouped Data:**

    **X̄ = ΣX / N**

    Where:

  • ΣX = sum of all observations
  • N = total number of observations
  • Σ = summation symbol (sum of)

    **General Form:**

    X̄ = (X₁ + X₂ + X₃ + ... + Xₙ) / N

    **Simple Example:**

    If six families have monthly incomes of Rs 1600, 1500, 1400, 1525, 1625, and 1630:

    X̄ = (1600 + 1500 + 1400 + 1525 + 1625 + 1630) / 6 = Rs 1,547

    This means the average family income is Rs 1,547.

    ---

    Arithmetic Mean for Ungrouped Data – Direct Method

    **Method:** Sum all observations and divide by the total number of observations.

    **Steps:**

    1. Add all the values in the dataset

    2. Count the total number of observations

    3. Divide the sum by the count

    **Example 1: Marks of Students**

    Calculate arithmetic mean from marks: 40, 50, 55, 78, 58

    X̄ = (40 + 50 + 55 + 78 + 58) / 5 = 281 / 5 = **56.2 marks**

    The average mark is 56.2.

    **When to use:** When there are only a few observations and the values themselves are small.
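The direct method can be checked with a minimal Python sketch. The function name `arithmetic_mean` is an illustrative choice, not something taken from the textbook.

```python
def arithmetic_mean(values):
    """Direct method: add every observation, then divide by the count."""
    return sum(values) / len(values)

marks = [40, 50, 55, 78, 58]                      # Example 1
incomes = [1600, 1500, 1400, 1525, 1625, 1630]    # six families' monthly incomes

mean_marks = arithmetic_mean(marks)
mean_income = arithmetic_mean(incomes)

print(mean_marks)           # 56.2
print(round(mean_income))   # 1547
```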

    ---

    Arithmetic Mean – Assumed Mean Method

    **Purpose:** This method simplifies calculation when there are many observations or large numerical values.

    **Logic:** Instead of adding all observations directly, we:

    1. Assume a value (A) from the data as the mean

    2. Calculate deviations (d) of each observation from this assumed mean

    3. Find the average deviation and add it to the assumed mean

    **Formula:**

    **X̄ = A + (Σd / N)**

    Where:

  • A = assumed mean (any value, preferably centrally located in data)
  • d = deviation from assumed mean = (X - A)
  • Σd = sum of all deviations
  • N = number of observations

    **Selection of Assumed Mean:** Choose a value near the middle of the data so that the deviations stay small and are easy to add.

    **Example 2: Weekly Income of Families**

    Family: A, B, C, D, E, F, G, H, I, J

    Income: 850, 700, 100, 750, 5000, 80, 420, 2500, 400, 360

    **Solution:**

    Assume A = 850 (centrally located)

    | Family | Income (X) | d = X - 850 |

    |--------|-----------|-----------|

    | A | 850 | 0 |

    | B | 700 | -150 |

    | C | 100 | -750 |

    | D | 750 | -100 |

    | E | 5000 | +4150 |

    | F | 80 | -770 |

    | G | 420 | -430 |

    | H | 2500 | +1650 |

    | I | 400 | -450 |

    | J | 360 | -490 |

    | **Total** | 11,160 | **+2,660** |

    X̄ = 850 + (2,660 / 10) = 850 + 266 = **Rs 1,116**
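The table above can be reproduced with a short Python sketch. The helper `assumed_mean_method` is a hypothetical name used only for illustration. It also shows that a different assumed mean leads to the same answer; only the deviations change.

```python
def assumed_mean_method(values, assumed):
    """Return (sum of deviations, mean) using X̄ = A + Σd / N."""
    deviations = [x - assumed for x in values]    # d = X - A
    total_dev = sum(deviations)                   # Σd
    return total_dev, assumed + total_dev / len(values)

incomes = [850, 700, 100, 750, 5000, 80, 420, 2500, 400, 360]

sum_d, mean_income = assumed_mean_method(incomes, assumed=850)
print(sum_d, mean_income)   # 2660 1116.0

# A different assumed mean changes Σd but not the final mean.
_, check = assumed_mean_method(incomes, assumed=500)
print(check)                # 1116.0
```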

    ---

    Arithmetic Mean – Step Deviation Method

    **Purpose:** Further simplifies calculations by dividing deviations by a common factor, avoiding large numbers.

    **Method:** When deviations (d) are large, divide them by a common factor (c) to get d'.

    **Formula:**

    **X̄ = A + (Σd' / N) × c**

    Where:

  • d' = d / c = (X - A) / c
  • c = common factor (usually 10, 100, etc., or the class width in grouped data)
  • A = assumed mean
  • N = number of observations

    **Advantage:** Reduces computational burden by working with smaller numbers.

    **Example (from Example 2):**

    If we divide deviations by 10:

    | Family | Income (X) | d = X - 850 | d' = d/10 |

    |--------|-----------|-----------|---------|

    | A | 850 | 0 | 0 |

    | B | 700 | -150 | -15 |

    | C | 100 | -750 | -75 |

    | D | 750 | -100 | -10 |

    | E | 5000 | +4150 | +415 |

    | F | 80 | -770 | -77 |

    | G | 420 | -430 | -43 |

    | H | 2500 | +1650 | +165 |

    | I | 400 | -450 | -45 |

    | J | 360 | -490 | -49 |

    | **Total** | | | **+266** |

    X̄ = 850 + (266/10) × 10 = 850 + 266 = **Rs 1,116**
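A minimal sketch of the same step-deviation calculation follows. `step_deviation_mean` is an illustrative name, and `Fraction` is used here only to keep the arithmetic exact.

```python
from fractions import Fraction

def step_deviation_mean(values, assumed, factor):
    """Return (Σd', mean) using X̄ = A + (Σd' / N) × c."""
    scaled = [Fraction(x - assumed, factor) for x in values]   # d' = (X - A) / c
    total_scaled = sum(scaled)                                 # Σd'
    mean = assumed + total_scaled / len(values) * factor
    return total_scaled, mean

incomes = [850, 700, 100, 750, 5000, 80, 420, 2500, 400, 360]
sum_d_prime, mean_income = step_deviation_mean(incomes, assumed=850, factor=10)
print(sum_d_prime, mean_income)   # 266 1116
```

Multiplying back by the common factor undoes the earlier division, which is why the result matches the assumed mean method.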

    ---

    Arithmetic Mean for Grouped Data – Discrete Series

    **Discrete Series:** Data presented as values with their corresponding frequencies.

    **Direct Method:**

    **X̄ = ΣfX / Σf**

    Where:

  • f = frequency of each observation
  • X = value of observation
  • ΣfX = sum of (frequency × value) for all observations
  • Σf = total of all frequencies

    **Steps:**

    1. Multiply each value (X) by its frequency (f) to get fX

    2. Sum all fX values to get ΣfX

    3. Sum all frequencies to get Σf

    4. Divide ΣfX by Σf

    **Example 3: Plot Sizes in Housing Colony**

    Plots come in three sizes: 100 sq. m, 200 sq. m, 300 sq. m with frequencies 200, 50, 10 respectively.

    | Plot Size (X) | Frequency (f) | fX |

    |--------------|--------------|------|

    | 100 | 200 | 20,000 |

    | 200 | 50 | 10,000 |

    | 300 | 10 | 3,000 |

    | **Total** | **260** | **33,000** |

    X̄ = 33,000 / 260 = **126.92 sq. metres**

    The average plot size in the colony is 126.92 sq. metres.
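The four steps of the direct method for a discrete series can be sketched in Python as follows. The function and variable names are assumptions made for this illustration.

```python
def discrete_mean(values, freqs):
    """X̄ = ΣfX / Σf for a discrete frequency distribution."""
    total_fx = sum(f * x for x, f in zip(values, freqs))   # ΣfX
    total_f = sum(freqs)                                   # Σf
    return total_fx, total_f, total_fx / total_f

plot_sizes = [100, 200, 300]    # sq. m
plot_counts = [200, 50, 10]     # number of plots of each size

sum_fx, sum_f, mean_size = discrete_mean(plot_sizes, plot_counts)
print(sum_fx, sum_f, round(mean_size, 2))   # 33000 260 126.92
```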

    **Assumed Mean Method for Discrete Series:**

    **X̄ = A + (Σfd / Σf)**

    Where d = X - A, and we weight each deviation by its frequency (fd).

    **Step Deviation Method for Discrete Series:**

    **X̄ = A + (Σfd' / Σf) × c**

    Where d' = (X - A) / c

    ---

    Arithmetic Mean for Grouped Data – Continuous Series

    **Continuous Series:** Data presented in class intervals with frequencies.

    **Key Difference:** Use the **midpoint (m)** of each class interval instead of the actual values.

    **Midpoint Formula:** m = (Lower limit + Upper limit) / 2

    **Direct Method:**

    **X̄ = Σfm / Σf**

    Where:

  • m = midpoint of class interval
  • f = frequency of class
  • Σfm = sum of (frequency × midpoint)
  • Σf = total frequency

    **Example 4: Average Marks (Exclusive Class Intervals)**

    | Marks | Frequency (f) | Midpoint (m) | fm |

    |-------|--------------|-------------|------|

    | 0–10 | 5 | 5 | 25 |

    | 10–20 | 12 | 15 | 180 |

    | 20–30 | 15 | 25 | 375 |

    | 30–40 | 25 | 35 | 875 |

    | 40–50 | 8 | 45 | 360 |

    | 50–60 | 3 | 55 | 165 |

    | 60–70 | 2 | 65 | 130 |

    | **Total** | **70** | | **2,110** |

    X̄ = 2,110 / 70 = **30.14 marks**
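As a sketch of the midpoint approach, the following Python snippet rebuilds the midpoint column and the mean for Example 4. Representing each class as a `(lower, upper)` tuple is a choice made for this illustration.

```python
def continuous_mean(classes, freqs):
    """X̄ = Σfm / Σf, where m is the midpoint of each class interval."""
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    total_fm = sum(f * m for f, m in zip(freqs, midpoints))   # Σfm
    return midpoints, total_fm / sum(freqs)

classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
freqs = [5, 12, 15, 25, 8, 3, 2]

midpoints, mean_marks = continuous_mean(classes, freqs)
print(midpoints)              # [5.0, 15.0, 25.0, 35.0, 45.0, 55.0, 65.0]
print(round(mean_marks, 2))   # 30.14
```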

    **Step Deviation Method for Continuous Series:**

    Choose assumed mean (A) = 35, common factor (c) = 10 (class width)

    **X̄ = A + (Σfd' / Σf) × c**

    | Marks | f | m | d' = (m-35)/10 | fd' |

    |-------|--|---|----------------|-----|

    | 0–10 | 5 | 5 | -3 | -15 |

    | 10–20 | 12 | 15 | -2 | -24 |

    | 20–30 | 15 | 25 | -1 | -15 |

    | 30–40 | 25 | 35 | 0 | 0 |

    | 40–50 | 8 | 45 | 1 | 8 |

    | 50–60 | 3 | 55 | 2 | 6 |

    | 60–70 | 2 | 65 | 3 | 6 |

    | **Total** | **70** | | | **-34** |

    X̄ = 35 + (-34/70) × 10 = 35 - 4.86 = **30.14 marks**
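The step-deviation table can be verified with a short sketch. The function name is hypothetical, and `Fraction` is used only so that the comparison with the direct method is exact.

```python
from fractions import Fraction

def grouped_step_deviation_mean(midpoints, freqs, assumed, width):
    """X̄ = A + (Σfd' / Σf) × c, with d' = (m - A) / c."""
    total_fd = sum(f * Fraction(m - assumed, width)
                   for m, f in zip(midpoints, freqs))          # Σfd'
    return total_fd, assumed + total_fd / sum(freqs) * width

midpoints = [5, 15, 25, 35, 45, 55, 65]
freqs = [5, 12, 15, 25, 8, 3, 2]

sum_fd, mean_marks = grouped_step_deviation_mean(midpoints, freqs, assumed=35, width=10)
print(sum_fd, round(float(mean_marks), 2))   # -34 30.14
```

The exact result is 211/7, the same fraction as 2,110/70 from the direct method.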

    ---

    Weighted Arithmetic Mean

    **Purpose:** When different values have different importance or weights, we calculate the weighted mean.

    **Formula:**

    **X̄w = (W₁X₁ + W₂X₂ + ... + WₙXₙ) / (W₁ + W₂ + ... + Wₙ)**

    Or simply: **X̄w = ΣWX / ΣW**

    Where:

  • W = weight assigned to each value
  • X = value
  • ΣWX = sum of (weight × value)
  • ΣW = sum of all weights

    **Practical Example:**

    A consumer purchases mangoes at Rs 20/kg and potatoes at Rs 8/kg. Mangoes comprise 30% of the budget, potatoes 70%.

    Weighted average price = (0.30 × 20 + 0.70 × 8) / (0.30 + 0.70) = (6 + 5.6) / 1 = **Rs 11.60 per kg**

    This is more realistic than the simple average, (20 + 8)/2 = Rs 14, because it reflects the consumer's actual spending pattern.
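A minimal sketch of the weighted mean for this example is given below. The budget shares are entered as percentages (30 and 70), which is an illustrative choice; any weights in the same proportion give the same result.

```python
def weighted_mean(values, weights):
    """X̄w = ΣWX / ΣW."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

prices = [20, 8]     # Rs per kg: mangoes, potatoes
shares = [30, 70]    # budget shares in percent, used as weights

print(weighted_mean(prices, shares))   # 11.6
print(sum(prices) / len(prices))       # 14.0 (simple average, for comparison)
```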

    ---

    Important Properties of Arithmetic Mean

    **Property 1: Sum of Deviations from Mean is Zero**

    **Σ(X - X̄) = 0**

    This means if we calculate how much each observation differs from the mean and sum these differences, the result is always zero.

    **Example:** Data: 4, 6, 8, 10, 12

    Mean = (4 + 6 + 8 + 10 + 12) / 5 = 8

    Deviations from 8: (4-8), (6-8), (8-8), (10-8), (12-8) = -4, -2, 0, +2, +4

    Sum = -4 - 2 + 0 + 2 + 4 = **0**
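The property can be checked numerically with a few lines of Python. This is only a sketch for the example data; with values that are not exactly representable as floats, the sum may come out as a tiny number very close to zero rather than exactly zero.

```python
data = [4, 6, 8, 10, 12]

mean = sum(data) / len(data)
deviations = [x - mean for x in data]    # X - X̄ for each observation

print(mean)              # 8.0
print(deviations)        # [-4.0, -2.0, 0.0, 2.0, 4.0]
print(sum(deviations))   # 0.0
```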

    **Property 2: Arithmetic Mean is Affected by Extreme Values**

    The A.M. is highly sensitive to outliers or extreme values. If one very large or very small value is added to the dataset, the mean shifts significantly.

    **Example:** Data: 1, 2, 3 → Mean = 2

    Data: 1, 2, 300 → Mean = 101

    The inclusion of 300 dramatically increases the mean, even though most values are small. This is a major limitation of the mean as a measure of central tendency.

    **Advantages of Arithmetic Mean:**

  • Easy to calculate and understand
  • Uses all values in the dataset
  • Can be used for further statistical analysis
  • Unique for any dataset

    **Disadvantages of Arithmetic Mean:**

  • Affected by extreme values (outliers)
  • May not coincide with any actual value in the dataset
  • Unsuitable when data has open-ended classes
  • Less useful for skewed distributions

    ---

    MEDIAN

    **Definition:** The **median** is the positional value that divides the distribution into two equal parts. It is the middle value when data is arranged in ascending or descending order.

    **Key Characteristic:** The median is based on position, not magnitude of values, making it **unaffected by extreme values**.

    **Conceptual Importance:** While the mean represents the average, the median represents the typical middle observation: half of the observations fall below it and half fall above it.

    ---

    Computation of Median – Ungrouped Data

    **Steps:**

    1. Arrange all observations in ascending order (from smallest to largest)

    2. Find the middle position using: **Position = (N + 1) / 2**

    3. The observation at this position is the median

    **Case 1: Odd Number of Observations**

    When N is odd, there is exactly one middle observation.

    **Example 5: Simple Dataset**

    Data: 5, 7, 6, 1, 8, 10, 12, 4, 3

    **Step 1:** Arrange in ascending order: 1, 3, 4, 5, **6**, 7, 8, 10, 12

    **Step 2:** Position = (9 + 1) / 2 = 5th position

    **Step 3:** The 5th observation is **6**

    **Median = 6**

    Interpretation: Half the values (1, 3, 4, 5) are below 6, and half (7, 8, 10, 12) are above 6.

    **Case 2: Even Number of Observations**

    When N is even, there are two middle observations. The median is the average of these two middle values.

    **Example 6: Even-Sized Dataset**

    Data (20 students' marks): 25, 72, 28, 65, 29, 60, 30, 54, 32, 53, 33, 52, 35, 51, 42, 48, 45, 47, 46, 33

    **Step 1:** Arrange in ascending order: 25, 28, 29, 30, 32, 33, 33, 35, 42, **45, 46**, 47, 48, 51, 52, 53, 54, 60, 65, 72

    **Step 2:** Position = (20 + 1) / 2 = 10.5th position

    This means the median lies between the 10th and 11th observations.

    **Step 3:** The 10th value is 45 and 11th value is 46

    **Median = (45 + 46) / 2 = 45.5 marks**

    Interpretation: Exactly 50% of students scored ≤ 45.5 and 50% scored ≥ 45.5 marks.
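Both cases (odd and even N) can be handled by one small function. The sketch below is illustrative; the name `median` and the use of 0-based list indexing are choices made here, not part of the textbook method.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if N is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # the (N + 1)/2-th item, counting from 1
    return (ordered[mid - 1] + ordered[mid]) / 2   # average of the N/2-th and (N/2 + 1)-th items

example_5 = [5, 7, 6, 1, 8, 10, 12, 4, 3]
example_6 = [25, 72, 28, 65, 29, 60, 30, 54, 32, 53,
             33, 52, 35, 51, 42, 48, 45, 47, 46, 33]

print(median(example_5))   # 6
print(median(example_6))   # 45.5
```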

    ---

    Median for Discrete Series

    **Method:** Use cumulative frequency (cf) to locate the position (N+1)/2 and identify the corresponding value.

    **Steps:**

    1. Prepare cumulative frequency column

    2. Calculate position = (N + 1) / 2

    3. Find this position in the cumulative frequency column

    4. The value corresponding to this cf is the median

    **Example 7: Income Distribution**

    | Income (Rs) | Frequency (f) | Cumulative Frequency (cf) |

    |------------|--------------|--------------------------|

    | 10 | 2 | 2 |

    | 20 | 4 | 6 |

    | 30 | 10 | 16 |

    | 40 | 4 | 20 |

    | **Total** | **N=20** | |

    **Step 1:** Position of median = (20 + 1) / 2 = 10.5th item

    **Step 2:** Look at cumulative frequency: 2, 6, **16**, 20

    The 10.5th item lies among the 7th to 16th items, which are covered by the cumulative frequency 16.

    **Step 3:** The income corresponding to cf = 16 is **Rs 30**

    **Median = Rs 30**

    Interpretation: 50% of persons earn Rs 30 or less, and 50% earn Rs 30 or more.
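The cumulative-frequency lookup can be sketched as follows. The function name is hypothetical, and the sketch assumes the values are already in ascending order. It does not handle the special case where the position falls exactly between two different values.

```python
from itertools import accumulate

def discrete_median(values, freqs):
    """Return (cf column, position, median): the first value whose cf reaches (N + 1)/2."""
    cum_freqs = list(accumulate(freqs))        # cumulative frequency column
    position = (cum_freqs[-1] + 1) / 2         # (N + 1) / 2
    for value, cf in zip(values, cum_freqs):
        if cf >= position:
            return cum_freqs, position, value

incomes = [10, 20, 30, 40]
persons = [2, 4, 10, 4]

cf, pos, med = discrete_median(incomes, persons)
print(cf, pos, med)   # [2, 6, 16, 20] 10.5 30
```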

    ---

    Median for Continuous Series

    **Method:** Use the **N/2** position (not (N+1)/2) and apply the median formula.

    **Steps:**

    1. Prepare cumulative frequency table

    2. Calculate N/2 position

    3. Identify the **median class** (the class containing the N/2th observation)

    4. Apply the median formula

    **Median Formula:**

    **Median = L + [(N/2 - cf) / f] × h**

    Where:

  • **L** = lower limit of the median class
  • **N** = total frequency
  • **cf** = cumulative frequency of the class before the median class
  • **f** = frequency of the median class
  • **h** = class width (magnitude of the class interval)

    **Example 8: Daily Wages**

    | Daily Wages (Rs) | Workers (f) | Cumulative Frequency (cf) |

    |-----------------|------------|--------------------------|

    | 20–25 | 14 | 14 |

    | 25–30 | 28 | 42 |

    | 30–35 | 33 | 75 |

    | 35–40 | 30 | 105 |

    | 40–45 | 20 | 125 |

    | 45–50 | 15 | 140 |

    | 50–55 | 13 | 153 |

    | 55–60 | 7 | 160 |

    | **Total** | **N=160** | |

    **Step 1:** N/2 = 160/2 = 80

    **Step 2:** Find where 80 falls in cf: The cf sequence is 14, 42, 75, **105**, 125, ...

    The 80th observation falls in the **35–40 class**.

    **Step 3:** For the median class 35–40:

  • L = 35
  • cf (before median class) = 75
  • f = 30
  • h = 40 - 35 = 5

    **Step 4:** Apply formula:

    Median = 35 + [(80 - 75) / 30] × 5 = 35 + (5/30) × 5 = 35 + 0.83 = **Rs 35.83**

    **Interpretation:** Exactly 50% of workers earn Rs 35.83 or less, and 50% earn Rs 35.83 or more.
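The four steps for a continuous series can be sketched in Python. The function name and the `(lower, upper)` tuple representation of classes are assumptions made for this illustration.

```python
def continuous_median(classes, freqs):
    """Median = L + ((N/2 - cf) / f) × h for a continuous series."""
    half = sum(freqs) / 2                    # N / 2
    cf_before = 0                            # cf of the classes before the current one
    for (lower, upper), f in zip(classes, freqs):
        if cf_before + f >= half:            # this is the median class
            return lower + (half - cf_before) / f * (upper - lower)
        cf_before += f

wage_classes = [(20, 25), (25, 30), (30, 35), (35, 40),
                (40, 45), (45, 50), (50, 55), (55, 60)]
workers = [14, 28, 33, 30, 20, 15, 13, 7]

median_wage = continuous_median(wage_classes, workers)
print(round(median_wage, 2))   # 35.83
```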

    ---

    Comparison: Mean vs Median vs Outliers

    **Activity Observation:**

    | Series | Values | Mean | Median | Observation |

    |--------|--------|------|--------|------------|

    | A | 1, 2, 3 | 2 | 2 | Same |

    | B | 1, 2, 30 | 11 | 2 | Mean shifted by outlier |

    | C | 1, 2, 300 | 101 | 2 | Mean much higher |

    | D | 1, 2, 3000 | 1001 | 2 | Mean extremely high |

    **Key Finding:** The median remains **2** in all cases, while the mean increases dramatically with outliers. This shows that the median is **resistant to extreme values**.

    **Outliers:** Values that are extremely different from other observations in the dataset. They can distort the mean but cannot affect the median significantly.
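The activity table can be reproduced with Python's standard `statistics` module. The dictionary layout below is just one convenient way to hold the four series.

```python
from statistics import mean, median

series = {
    "A": [1, 2, 3],
    "B": [1, 2, 30],
    "C": [1, 2, 300],
    "D": [1, 2, 3000],
}

# Map each series name to its (mean, median) pair.
summary = {name: (mean(vals), median(vals)) for name, vals in series.items()}

for name, (avg, mid) in summary.items():
    print(name, avg, mid)   # the median column stays at 2 throughout
```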

    **When to Use Which:**

  • **Use Mean:** When the data are fairly evenly spread around the centre, with no extreme values (for example, a roughly normal distribution)
  • **Use Median:** When data has outliers or is skewed (unequal distribution), as with income data, property prices, or test scores with exceptional performers
  • **Use Mode:** When interested in the most frequent value

    ---

    Quartiles

    **Definition:** **Quartiles** are positional measures that divide the entire distribution into four equal parts, each containing 25% of the observations.

    **Three Quartiles:**

    **Q₁ (First Quartile/Lower Quartile):**

  • 25% of observations lie below Q₁
  • 75% of observations lie above Q₁
  • Divides the data at the 25th percentile

    **Q₂ (Second Quartile/Median):**

  • 50% of observations lie below Q₂
  • 50% of observations lie above Q₂
  • This is the median we already studied

    **Q₃ (Third Quartile/Upper Quartile):**

  • 75% of observations lie below Q₃
  • 25% of observations lie above Q₃
  • Divides the data at the 75th percentile

    **Interquartile Range (IQR):** The range between Q₁ and Q₃ contains the central 50% of the data. IQR = Q₃ - Q₁

    ---

    Calculation of Quartiles

    **Formula for Ungrouped/Discrete Data:**

    **Q₁ = Size of [(N + 1) / 4]th item**

    **Q₃ = Size of [3(N + 1) / 4]th item**

    **Method:**

    1. Arrange data in ascending order

    2. Apply the position formula

    3. If position is a whole number, that observation is the quartile

    4. If position is decimal (like 2.75), interpolate between the two surrounding observations

    **Example 9: Lower Quartile Calculation**

    Data (10 students' marks): 22, 26, 14, 30, 18, 11, 35, 41, 12, 32

    **Step 1:** Arrange in ascending order: 11, 12, 14, 18, 22, 26, 30, 32, 35, 41

    **Step 2:** Q₁ position = (10 + 1) / 4 = 11/4 = 2.75th item

    **Step 3:** The 2.75th position means:

  • 75% of the way between 2nd and 3rd items
  • 2nd item = 12
  • 3rd item = 14
  • Q₁ = 12 + 0.75(14 - 12) = 12 + 1.5 = **13.5 marks**

    This means 25% of students scored 13.5 marks or less.

    **Finding Q₃:**

    Q₃ position = 3(10 + 1) / 4 = 33/4 = 8.25th item

  • 8th item = 32
  • 9th item = 35
  • Q₃ = 32 + 0.25(35 - 32) = 32 + 0.75 = **32.75 marks**

    This means 75% of students scored 32.75 marks or less.

    **Interpretation:** The central 50% of students scored between 13.5 and 32.75 marks.
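The position formula and the interpolation step can be sketched as one function. The name `quartile` and its parameter `k` are hypothetical, and the sketch assumes the computed position lies between the 1st and the N-th item.

```python
def quartile(values, k):
    """k-th quartile at position k(N + 1)/4 (counting from 1), with linear interpolation."""
    ordered = sorted(values)
    position = k * (len(ordered) + 1) / 4
    whole = int(position)              # item just below the position
    fraction = position - whole        # how far to move towards the next item
    if fraction == 0:
        return ordered[whole - 1]
    return ordered[whole - 1] + fraction * (ordered[whole] - ordered[whole - 1])

marks = [22, 26, 14, 30, 18, 11, 35, 41, 12, 32]

q1, q3 = quartile(marks, 1), quartile(marks, 3)
print(q1, q3, q3 - q1)   # 13.5 32.75 19.25
```

The last figure is the interquartile range, Q₃ - Q₁. Python's `statistics.quantiles(marks, n=4)` with its default `"exclusive"` method uses the same (N + 1)-based positions, so it should agree with these values.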

    ---

    Percentiles

    **Definition:** **Percentiles** are positional measures that divide the distribution into 100 equal parts. There are 99 dividing positions: P₁, P₂, P₃, ..., P₉₉.

    **Key Percentiles:**

  • **P₅₀ = Median** (50th percentile divides at the middle)
  • **P₂₅ = Q₁** (First quartile)
  • **P₇₅ = Q₃** (Third quartile)

    **Interpretation:** If you score at the **82nd percentile** on an examination:

  • About 18% of candidates scored higher than you
  • You performed better than 82% of the candidates who took the exam
  • If 1,00,000 students appeared, you rank above 82,000 of them

    **Real-Life Application in India:**

  • NEET, JEE Main, CAT percentile scores
  • SAT/GRE/GMAT percentiles used for university admissions
  • Income percentiles for wealth distribution analysis (lower income percentiles vs upper income percentiles)

    ---

    MODE

    **Definition:** The **mode** is the value that occurs with the highest frequency in a dataset. It is the most typical or most fashionable value around which maximum concentration of items occurs. Denoted by **Mo**.

    **Derived from:** French word "la Mode" meaning the most fashionable value.

    **Practical Applications:**

  • A shoe manufacturer wants to know which size has the highest demand
  • A clothing retailer wants the most popular shirt style
  • A transport company wants to know the most common vehicle type customers request
  • Government wants to know the most common age group in the population

    ---

    Mode for Discrete Series

    **Method:** Identify the observation with the highest frequency.

    **Example 10: Size Frequency Distribution**

    | Variable (Size) | 10 | 20 | 30 | 40 | 50 |

    |-----------------|----|----|----|----|-----|

    | Frequency | 2 | 8 | 20 | 10 | 5 |

    **Highest frequency = 20**

    **Corresponding value = 30**

    **Mode = 30**

    Interpretation: The size 30 is the most frequently occurring, so it has the highest demand.
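For a discrete series, finding the mode is a lookup of the largest frequency. A minimal sketch with illustrative variable names:

```python
sizes = [10, 20, 30, 40, 50]
freqs = [2, 8, 20, 10, 5]

highest = max(freqs)                       # highest frequency
mode_size = sizes[freqs.index(highest)]    # value that carries it

print(highest, mode_size)   # 20 30
```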

    ---

    Types of Frequency Distributions

    **Unimodal Distribution:**

  • Only one mode exists (one value with highest frequency)
  • Example above is unimodal with Mode = 30

    **Bimodal Distribution:**

  • Two values with equal highest frequency
  • Example: Data 1, 1, 2, 2, 3, 4, 5
  • Both 1 and 2 appear twice (highest frequency)
  • Modes = 1 and 2

    **Multimodal Distribution:**

  • More than two values with equal highest frequency
  • Example: 1, 1, 2, 2, 3, 3, 4
  • Modes = 1, 2, 3 (each appears twice)

    **No Mode:**

  • All values appear with equal frequency
  • Example: 1, 1, 2, 2, 3, 3, 4, 4
  • No unique mode exists
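The classification above can be sketched with `collections.Counter`. The function `modes` is a hypothetical helper. Returning an empty list when every value ties is a convention chosen here to match the "No Mode" case.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest frequency, or [] when all values tie."""
    counts = Counter(values)
    top = max(counts.values())
    tied = sorted(v for v, c in counts.items() if c == top)
    # If every distinct value shares the top frequency, treat the data as having no mode.
    if len(counts) > 1 and len(tied) == len(counts):
        return []
    return tied

print(modes([1, 1, 2, 2, 3, 4, 5]))      # [1, 2]     (bimodal)
print(modes([1, 1, 2, 2, 3, 3, 4]))      # [1, 2, 3]  (multimodal)
print(modes([1, 1, 2, 2, 3, 3, 4, 4]))   # []         (no mode)
```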

    ---

    Mode for Continuous Series

    **Method:** Identify the **modal class** (class with highest frequency), then use the mode formula.

    **Modal Class:** The class interval with the maximum frequency.

    **Mode Formula:**

    **Mode = L + [(f₁ - f₀) / (2f₁ - f₀ - f₂)] × h**

    Where:

  • **L** = lower limit of the modal class
  • **f₁** = frequency of the modal class
  • **f₀** = frequency of the class before the modal class (preceding class)
  • **f₂** = frequency of the class after the modal class (succeeding class)
  • **h** = class width (magnitude of class interval)

    **Assumptions:**

  • Modal class is the class with maximum frequency
  • Frequencies should not have extreme variations

    ---

    Calculation Steps for Mode in Continuous Series

    1. Identify the class with the highest frequency → this is the modal class

    2. Note down L, f₁, f₀, f₂, and h from the table

    3. Substitute values in the mode formula

    4. Calculate and obtain the mode value

    **Example (Hypothetical):**

    | Marks | 0–10 | 10–20 | 20–30 | 30–40 | 40–50 |

    |-------|------|-------|-------|-------|-------|

    | Frequency | 5 | 12 | **25** | 20 | 8 |

    **Modal Class:** 20–30 (highest frequency = 25)

  • L = 20
  • f₁ = 25
  • f₀ = 12
  • f₂ = 20
  • h = 10

    Mode = 20 + [(25 - 12) / (2×25 - 12 - 20)] × 10

    Mode = 20 + [13 / (50 - 32)] × 10

    Mode = 20 + [13 / 18] × 10

    Mode = 20 + 7.22 = **27.22 marks**
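The same calculation can be sketched in Python. The function name is hypothetical. Treating a missing neighbouring class as having frequency 0 (when the modal class is the first or last class) is an assumption of this sketch, not something stated in the notes.

```python
def continuous_mode(classes, freqs):
    """Mode = L + ((f1 - f0) / (2*f1 - f0 - f2)) × h for the modal class."""
    i = freqs.index(max(freqs))                       # index of the modal class
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                 # assumption: no preceding class means 0
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0    # assumption: no succeeding class means 0
    lower, upper = classes[i]
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * (upper - lower)

mark_classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [5, 12, 25, 20, 8]

mode_marks = continuous_mode(mark_classes, freqs)
print(round(mode_marks, 2))   # 27.22
```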

    ---

    Comparison of Arithmetic Mean, Median, and Mode

    | Feature | Mean | Median | Mode |

    |---------|------|--------|------|

    | **Definition** | Sum of values ÷ number of observations | Middle value | Value with highest frequency |

    | **Position-based** | No | Yes | Yes |

    | **Affected by Outliers** | Yes | No | No |

    | **Unique Value** | Always | Always | May have multiple |

    | **Calculation** | Uses all values | Uses position only | Uses frequency |

    | **Best for** | Normal distribution | Skewed data | Categorical data |

    | **Practical Use** | Average salary | Median house price | Popular product size |

    ---

    Advantages and Disadvantages of Mode

    **Advantages:**

  • Easy to understand and calculate
  • Not affected by extreme values
  • Can be used for both numerical and categorical data
  • Useful for discrete series and distributions with clear peaks
  • Represents the most typical or fashionable value

    **Disadvantages:**

  • May not exist or may have multiple modes
  • Less useful when all values have equal frequency
  • Cannot be used for further statistical analysis easily
  • Less stable than mean and median
  • Inadequate for numerical comparison in some cases

    ---

    SUMMARY AND EXAM PREPARATION NOTES

    When to Use Which Average

    **Use Arithmetic Mean when:**

  • Data is symmetrical without extreme values
  • Need for further statistical calculations
  • Dealing with numerical continuous data
  • Policy decisions based on average income, expenditure, production

    **Use Median when:**

  • Data contains outliers or extreme values
  • Data is skewed (asymmetrical)
  • Dealing with income, property prices, wealth distribution
  • Need to find the middle value that is unaffected by extremes

    **Use Mode when:**

  • Need to identify the most popular/fashionable item
  • Dealing with categorical data (sizes, colors, brands)
  • Analyzing consumer preferences
  • Distribution has a clear peak

    Key Formulas to Remember

    1. **Arithmetic Mean (ungrouped):** X̄ = ΣX / N

    2. **Arithmetic Mean (grouped):** X̄ = ΣfX / Σf

    3. **Assumed Mean Method:** X̄ = A + (Σd / N)

    4. **Step Deviation Method:** X̄ = A + [(Σd' / N) × c]

    5. **Weighted Mean:** X̄w = ΣWX / ΣW

    6. **Median Position:** (N + 1) / 2 for ungrouped; N / 2 for grouped

    7. **Median Formula (continuous):** Median = L + [(N/2 - cf) / f] × h

    8. **Q₁ Position:** (N + 1) / 4

    9. **Q₃ Position:** 3(N + 1) / 4

    10. **Mode Formula (continuous):** Mode = L + [(f₁ - f₀) / (2f₁ - f₀ - f₂)] × h

    Important Properties

  • **Sum of deviations from mean = 0:** Σ(X - X̄) = 0
  • **Mean is sensitive to all values and extreme values**
  • **Median is position-based and resistant to extreme values**

    MCQs — 10 Questions with Answers

    Q1. What is the Arithmetic Mean of the following marks: 40, 50, 55, 78, 58?

    • A. 56.2 ✓
    • B. 55.0
    • C. 58.0
    • D. 60.0

    Answer: A — Sum = 40 + 50 + 55 + 78 + 58 = 281; Mean = 281 ÷ 5 = 56.2.

    Q2. In the Assumed Mean method, if A = 850 and d = X – A = –150, what is the value of X?

    • A. 700 ✓
    • B. 750
    • C. 800
    • D. 900

    Answer: A — Since d = X – A, then X = A + d = 850 + (–150) = 700.

    Q3. Which method is most suitable when data contains very large numerical figures?

    • A. Direct Method
    • B. Assumed Mean Method or Step Deviation Method ✓
    • C. Only Direct Method can handle large figures
    • D. Median calculation is better

    Answer: B — Assumed Mean and Step Deviation methods reduce computational complexity by working with smaller deviation values instead of large original figures.

    Q4. For grouped discrete data with frequencies, the correct formula for Arithmetic Mean is:

    • A. X̄ = ΣX / N
    • B. X̄ = Σf / ΣX
    • C. X̄ = ΣfX / Σf ✓
    • D. X̄ = ΣX / Σf

    Answer: C — In grouped data, each value X must be weighted by its frequency f; thus X̄ = ΣfX / Σf is the correct formula.

    Q5. A housing colony has 200 plots of 100 sq.m, 50 plots of 200 sq.m, and 10 plots of 300 sq.m. What is the mean plot size?

    • A. 126.92 sq.m ✓
    • B. 200 sq.m
    • C. 150 sq.m
    • D. 175 sq.m

    Answer: A — Using ΣfX / Σf = (200×100 + 50×200 + 10×300) / (200 + 50 + 10) = 33,000 / 260 = 126.92 sq.m.

    Q6. In the Step Deviation method with c = 10, if Σd' = 266 and N = 10, the adjustment to assumed mean is:

    • A. 26.6
    • B. 266 ✓
    • C. 2.66
    • D. 0.266

    Answer: B — The adjustment = (Σd' / N) × c = (266 / 10) × 10 = 266, which is added to the assumed mean (850 + 266 = 1,116 in Example 2).

    Q7. Which statement is CORRECT about central tendency?

    • A. The same average (mean, median, or mode) works best for all datasets
    • B. Central tendency always removes all information about data variation
    • C. Central tendency summarizes a large dataset into a single representative value ✓
    • D. Central tendency can only be calculated for grouped data

    Answer: C — Central tendency is designed to represent an entire dataset with one typical value; it works for both ungrouped and grouped data, but different averages suit different situations.

    Q8. Assertion (A): The Assumed Mean method always gives a different result than the Direct method. Reason (R): Both methods calculate the same mean but use different calculation paths.

    • A. A is true, R is true, R explains A
    • B. A is true, R is true, R does not explain A
    • C. A is false, R is true ✓
    • D. A is true, R is false

    Answer: C — Both methods calculate the identical mean value; the Assumed Mean method simply provides a computational shortcut. The assertion is false because results are the same, not different.

    Q9. If the assumed mean is set too far from the actual data center, which problem occurs in calculations?

    • A. The final mean becomes incorrect
    • B. Deviations become very large, increasing calculation complexity ✓
    • C. Frequency values change automatically
    • D. The Step Deviation method cannot be applied

    Answer: B — A poor choice of assumed mean far from the data's center produces large deviations (d values), making arithmetic more cumbersome, though the final result remains correct if calculated properly.

    Q10. In Baiju's village example, comparing his 1-acre holding to the average holding of 50 farmers helps determine which economic aspect?

    • A. His absolute land area in acres
    • B. His relative economic position compared to the village average ✓
    • C. The total village population
    • D. The government subsidy he will receive

    Answer: B — The comparison of Baiju's holding to the average (central tendency) reveals whether he is above, below, or at the typical economic level of his village, showing his relative standing.

    Flashcards

    What is the formula for Arithmetic Mean in ungrouped data?

    X̄ = ΣX / N, where ΣX is the sum of all observations and N is the total number of observations.

    When should the Assumed Mean method be used instead of Direct method?

    When the data contains a large number of observations or very large numerical figures that make direct calculation tedious.

    What is a deviation (d) in the Assumed Mean method?

    Deviation d = X – A, where X is an individual observation and A is the assumed mean.

    How does the Step Deviation method differ from the Assumed Mean method?

    Step Deviation divides all deviations by a common factor c to simplify calculations: d' = d / c, avoiding large numbers.

    What is the formula for Arithmetic Mean in grouped discrete data?

    X̄ = ΣfX / Σf, where f is frequency and X is the observation value.

    Why is the central tendency called a 'representative value'?

    Because one single number summarizes an entire dataset in a way that represents all observations collectively.

    In the Baiju example, why do we need three averages (mean, median, mode) and not just one?

    Because each average tells a different story—mean shows mathematical average, median shows middle position, and mode shows most common holding size.

    What does the summation symbol Σ mean?

    It means 'sum of' or 'add up all'—for example, ΣX means add all X values together.

    In Example 2, why was 850 chosen as the assumed mean?

    Because 850 is a centrally located value in the data, which simplifies calculations and reduces the size of deviations.

    What is the key difference between ungrouped and grouped data in calculating mean?

    Ungrouped data uses X values directly, while grouped data must multiply each X by its frequency f before summing.

    Important Board Questions

    Define Arithmetic Mean and write its formula for ungrouped data. Give one real-life example. [2 marks]

    Define as: sum of all observations divided by number of observations. Write: X̄ = ΣX / N. Example: average marks in a test, average rainfall, average income, etc. with one sentence explanation.

    Calculate the Arithmetic Mean using the Assumed Mean method for the following weekly income of 10 families (in Rs): 850, 700, 100, 750, 5000, 80, 420, 2500, 400, 360. Show all working steps. Why would the Assumed Mean method be preferred over the Direct method for this data? [5 marks]

    Choose A = 850 (central value). Calculate d = X – A for each observation. Sum all deviations (Σd = 2660). Apply X̄ = A + (Σd / N) = 850 + 2660/10 = Rs 1,116. Explain: Large figures and many observations make Direct method tedious; Assumed Mean method simplifies by reducing numbers into smaller deviations.

    In a housing colony, plot sizes are categorized as: 100 sq.m (200 plots), 200 sq.m (50 plots), and 300 sq.m (10 plots). (i) Calculate the mean plot size using the grouped data formula. (ii) Explain why frequency must be multiplied with each observation value. (iii) How would Baiju use this mean to assess if his 1-acre (4047 sq.m) plot is typical or unusual? Discuss the limitations of using only mean for this comparison. [6 marks]

    Use formula X̄ = ΣfX / Σf. Calculate: (200×100 + 50×200 + 10×300) ÷ 260 = 126.92 sq.m. Explain frequency: Different counts of each size require weighted average, not simple average. For Baiju: 4047 sq.m >> 126.92 sq.m, so his plot is much larger—unusual. Limitations: Mean alone cannot show variability (range, spread); compare with median and mode too; single number hides distribution of holdings among farmers.
