**Central tendency** is a numerical method to summarise a large set of data by representing it with a single typical or representative value. The purpose is to condense vast information into one meaningful number that captures the essential characteristic of the data.
**Real-life examples where central tendency is used:**
**Why is summarisation needed?**
When dealing with large datasets, it becomes impossible to draw meaningful conclusions by examining each individual observation. A single value that represents the entire dataset helps in:
**Case Study: Baiju's Farm**
Baiju is a farmer in Balapur village, Buxar district, Bihar. To evaluate Baiju's economic condition relative to other small farmers, we would need to:
1. Compare his land holding with the **arithmetic mean** (average size) of all farmers' holdings
2. Check if his holding is above the size that half the farmers own (using **median**)
3. Determine if his holding is above what most farmers own (using **mode**)
This example illustrates why we need different types of averages for different analytical purposes.
---
**Definition:** The arithmetic mean is the sum of all values of observations divided by the total number of observations. It is the most commonly used measure of central tendency and is denoted by **X̄** (X-bar).
**Formula for Ungrouped Data:**
**X̄ = ΣX / N**
Where ΣX = sum of all observations and N = total number of observations.
**General Form:**
X̄ = (X₁ + X₂ + X₃ + ... + Xₙ) / N
**Simple Example:**
If six families have monthly incomes of Rs 1600, 1500, 1400, 1525, 1625, and 1630:
X̄ = (1600 + 1500 + 1400 + 1525 + 1625 + 1630) / 6 = 9,280 / 6 ≈ Rs 1,547
This means the average family income is about Rs 1,547 (Rs 1,546.67 before rounding).
---
**Method:** Sum all observations and divide by the total number of observations.
**Steps:**
1. Add all the values in the dataset
2. Count the total number of observations
3. Divide the sum by the count
**Example 1: Marks of Students**
Calculate arithmetic mean from marks: 40, 50, 55, 78, 58
X̄ = (40 + 50 + 55 + 78 + 58) / 5 = 281 / 5 = **56.2 marks**
The average mark is 56.2.
**When to use:** When there are only a few observations and the figures are small enough to add up directly.
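The direct method can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original notes, and the function name `arithmetic_mean` is a hypothetical choice.

```python
def arithmetic_mean(values):
    """Direct method: sum of the observations divided by their count."""
    if not values:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / len(values)


marks = [40, 50, 55, 78, 58]  # Example 1
print(arithmetic_mean(marks))  # 56.2

family_incomes = [1600, 1500, 1400, 1525, 1625, 1630]  # six-family example
print(round(arithmetic_mean(family_incomes), 2))  # 1546.67
```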
---
**Purpose:** This method simplifies calculation when there are many observations or large numerical values.
**Logic:** Instead of adding all observations directly, we:
1. Assume a value (A) from the data as the mean
2. Calculate deviations (d) of each observation from this assumed mean
3. Find the average deviation and add it to the assumed mean
**Formula:**
**X̄ = A + (Σd / N)**
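Where A = assumed mean, d = X - A (deviation of each observation from the assumed mean), and N = total number of observations.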
**Selection of Assumed Mean:** Choose a value that appears in the middle of the data to minimize deviation calculations and avoid large numbers.
**Example 2: Weekly Income of Families**
Family: A, B, C, D, E, F, G, H, I, J
Income: 850, 700, 100, 750, 5000, 80, 420, 2500, 400, 360
**Solution:**
Assume A = 850 (centrally located)
| Family | Income (X) | d = X - 850 |
|--------|-----------|-----------|
| A | 850 | 0 |
| B | 700 | -150 |
| C | 100 | -750 |
| D | 750 | -100 |
| E | 5000 | +4150 |
| F | 80 | -770 |
| G | 420 | -430 |
| H | 2500 | +1650 |
| I | 400 | -450 |
| J | 360 | -490 |
| **Total** | 11,160 | **+2,660** |
X̄ = 850 + (2,660 / 10) = 850 + 266 = **Rs 1,116**
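As a rough check on Example 2, the assumed mean method might be written as below. The function name `assumed_mean_method` is hypothetical; the second call uses a different assumed mean (400, chosen only for illustration) to show that the final result does not depend on the choice of A.

```python
def assumed_mean_method(values, assumed):
    """X-bar = A + (sum of d) / N, where d = X - A."""
    deviations = [x - assumed for x in values]
    return assumed + sum(deviations) / len(values)


incomes = [850, 700, 100, 750, 5000, 80, 420, 2500, 400, 360]  # Example 2

print(assumed_mean_method(incomes, 850))  # 1116.0
print(assumed_mean_method(incomes, 400))  # 1116.0
```

A different A changes only the size of the deviations, not the mean itself.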
---
**Purpose:** Further simplifies calculations by dividing deviations by a common factor, avoiding large numbers.
**Method:** When deviations (d) are large, divide them by a common factor (c) to get d'.
**Formula:**
**X̄ = A + (Σd' / N) × c**
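Where d' = d / c = (X - A) / c, c = common factor, A = assumed mean, and N = total number of observations.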
**Advantage:** Reduces computational burden by working with smaller numbers.
**Example (from Example 2):**
If we divide deviations by 10:
| Family | Income (X) | d = X - 850 | d' = d/10 |
|--------|-----------|-----------|---------|
| A | 850 | 0 | 0 |
| B | 700 | -150 | -15 |
| C | 100 | -750 | -75 |
| D | 750 | -100 | -10 |
| E | 5000 | +4150 | +415 |
| F | 80 | -770 | -77 |
| G | 420 | -430 | -43 |
| H | 2500 | +1650 | +165 |
| I | 400 | -450 | -45 |
| J | 360 | -490 | -49 |
| **Total** | | | **+266** |
X̄ = 850 + (266/10) × 10 = 850 + 266 = **Rs 1,116**
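The same calculation can be sketched in Python. The function name `step_deviation_mean` and its parameter names are hypothetical; the data, assumed mean and common factor are the ones used in the table above.

```python
def step_deviation_mean(values, assumed, factor):
    """X-bar = A + (sum of d' / N) * c, where d' = (X - A) / c."""
    step_deviations = [(x - assumed) / factor for x in values]
    return assumed + (sum(step_deviations) / len(values)) * factor


incomes = [850, 700, 100, 750, 5000, 80, 420, 2500, 400, 360]

# A = 850 and c = 10, as in the worked example
print(round(step_deviation_mean(incomes, 850, 10), 2))  # 1116.0
```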
---
**Discrete Series:** Data presented as values with their corresponding frequencies.
**Direct Method:**
**X̄ = ΣfX / Σf**
Where f = frequency of each value, X = value of the observation, and Σf = total number of observations.
**Steps:**
1. Multiply each value (X) by its frequency (f) to get fX
2. Sum all fX values to get ΣfX
3. Sum all frequencies to get Σf
4. Divide ΣfX by Σf
**Example 3: Plot Sizes in Housing Colony**
Plots come in three sizes: 100 sq. m, 200 sq. m, 300 sq. m with frequencies 200, 50, 10 respectively.
| Plot Size (X) | Frequency (f) | fX |
|--------------|--------------|------|
| 100 | 200 | 20,000 |
| 200 | 50 | 10,000 |
| 300 | 10 | 3,000 |
| **Total** | **260** | **33,000** |
X̄ = 33,000 / 260 = **126.92 sq. metres**
The average plot size in the colony is 126.92 sq. metres.
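A small Python sketch of the direct method for a discrete series, using the plot sizes from Example 3. The function name `frequency_mean` is hypothetical.

```python
def frequency_mean(values, freqs):
    """X-bar = sum(f * X) / sum(f) for a discrete frequency series."""
    if len(values) != len(freqs):
        raise ValueError("each value needs exactly one frequency")
    total_f = sum(freqs)
    total_fx = sum(f * x for x, f in zip(values, freqs))
    return total_fx / total_f


plot_sizes = [100, 200, 300]   # sq. metres
plot_counts = [200, 50, 10]    # number of plots of each size

print(round(frequency_mean(plot_sizes, plot_counts), 2))  # 126.92
```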
**Assumed Mean Method for Discrete Series:**
**X̄ = A + (Σfd / Σf)**
Where d = X - A, and we weight each deviation by its frequency (fd).
**Step Deviation Method for Discrete Series:**
**X̄ = A + (Σfd' / Σf) × c**
Where d' = (X - A) / c
---
**Continuous Series:** Data presented in class intervals with frequencies.
**Key Difference:** Use the **midpoint (m)** of each class interval instead of the actual values.
**Midpoint Formula:** m = (Lower limit + Upper limit) / 2
**Direct Method:**
**X̄ = Σfm / Σf**
Where f = frequency of each class, m = midpoint of the class, and Σf = total number of observations.
**Example 4: Average Marks (Exclusive Class Intervals)**
| Marks | Frequency (f) | Midpoint (m) | fm |
|-------|--------------|-------------|------|
| 0–10 | 5 | 5 | 25 |
| 10–20 | 12 | 15 | 180 |
| 20–30 | 15 | 25 | 375 |
| 30–40 | 25 | 35 | 875 |
| 40–50 | 8 | 45 | 360 |
| 50–60 | 3 | 55 | 165 |
| 60–70 | 2 | 65 | 130 |
| **Total** | **70** | | **2,110** |
X̄ = 2,110 / 70 = **30.14 marks**
**Step Deviation Method for Continuous Series:**
Choose assumed mean (A) = 35, common factor (c) = 10 (class width)
**X̄ = A + (Σfd' / Σf) × c**
| Marks | f | m | d' = (m-35)/10 | fd' |
|-------|--|---|----------------|-----|
| 0–10 | 5 | 5 | -3 | -15 |
| 10–20 | 12 | 15 | -2 | -24 |
| 20–30 | 15 | 25 | -1 | -15 |
| 30–40 | 25 | 35 | 0 | 0 |
| 40–50 | 8 | 45 | 1 | 8 |
| 50–60 | 3 | 55 | 2 | 6 |
| 60–70 | 2 | 65 | 3 | 6 |
| **Total** | **70** | | | **-34** |
X̄ = 35 + (-34/70) × 10 = 35 - 4.86 = **30.14 marks**
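Both routes for Example 4 can be checked with a short Python sketch. The representation of each class as a `(lower, upper, frequency)` tuple and the function names are assumptions made for illustration.

```python
# Each class is (lower limit, upper limit, frequency), as in Example 4.
marks_classes = [
    (0, 10, 5), (10, 20, 12), (20, 30, 15), (30, 40, 25),
    (40, 50, 8), (50, 60, 3), (60, 70, 2),
]


def grouped_mean_direct(classes):
    """X-bar = sum(f * m) / sum(f), with m the class midpoint."""
    total_f = sum(f for _, _, f in classes)
    total_fm = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total_fm / total_f


def grouped_mean_step(classes, assumed, factor):
    """X-bar = A + (sum(f * d') / sum(f)) * c, with d' = (m - A) / c."""
    total_f = sum(f for _, _, f in classes)
    total_fd = sum(
        f * ((lower + upper) / 2 - assumed) / factor
        for lower, upper, f in classes
    )
    return assumed + (total_fd / total_f) * factor


print(round(grouped_mean_direct(marks_classes), 2))        # 30.14
print(round(grouped_mean_step(marks_classes, 35, 10), 2))  # 30.14
```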
---
**Purpose:** When different values have different importance or weights, we calculate the weighted mean.
**Formula:**
**X̄w = (W₁X₁ + W₂X₂ + ... + WₙXₙ) / (W₁ + W₂ + ... + Wₙ)**
Or simply: **X̄w = ΣWX / ΣW**
Where W₁, W₂, ..., Wₙ are the weights attached to the values X₁, X₂, ..., Xₙ.
**Practical Example:**
A consumer purchases mangoes at Rs 20/kg and potatoes at Rs 8/kg. Mangoes comprise 30% of the budget, potatoes 70%.
Weighted average price = (0.30 × 20 + 0.70 × 8) / (0.30 + 0.70) = (6 + 5.6) / 1 = **Rs 11.60 per kg**
This is more realistic than simple average (20 + 8)/2 = 14 because it reflects the consumer's actual spending pattern.
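The mango and potato example can be reproduced with a short Python sketch. The function name `weighted_mean` is hypothetical; the prices and budget shares are those given above.

```python
def weighted_mean(values, weights):
    """X-bar-w = sum(W * X) / sum(W)."""
    if len(values) != len(weights):
        raise ValueError("each value needs exactly one weight")
    total_w = sum(weights)
    if total_w == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_w


prices = [20, 8]        # Rs per kg: mangoes, potatoes
shares = [0.30, 0.70]   # budget shares used as weights

print(round(weighted_mean(prices, shares), 2))  # 11.6
print(sum(prices) / len(prices))                # 14.0 (simple average)
```

Because the formula divides by ΣW, the weights do not have to sum to 1; weights of 30 and 70 give the same result.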
---
**Property 1: Sum of Deviations from Mean is Zero**
**Σ(X - X̄) = 0**
This means if we calculate how much each observation differs from the mean and sum these differences, the result is always zero.
**Example:** Data: 4, 6, 8, 10, 12
Mean = (4 + 6 + 8 + 10 + 12) / 5 = 8
Deviations from 8: (4-8), (6-8), (8-8), (10-8), (12-8) = -4, -2, 0, +2, +4
Sum = -4 - 2 + 0 + 2 + 4 = **0**
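The same check can be run in Python as a quick sanity test. This is only an illustration of the property on the five values above; with non-integer data, floating-point rounding can leave a tiny non-zero remainder.

```python
data = [4, 6, 8, 10, 12]
mean = sum(data) / len(data)

# Deviation of each observation from the mean
deviations = [x - mean for x in data]

print(mean)             # 8.0
print(deviations)       # [-4.0, -2.0, 0.0, 2.0, 4.0]
print(sum(deviations))  # 0.0
```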
**Property 2: Arithmetic Mean is Affected by Extreme Values**
The A.M. is highly sensitive to outliers or extreme values. If one very large or very small value is added to the dataset, the mean shifts significantly.
**Example:** Data: 1, 2, 3 → Mean = 2
Data: 1, 2, 300 → Mean = 101
The inclusion of 300 dramatically increases the mean, even though most values are small. This is a major limitation of the mean as a measure of central tendency.
**Advantages of Arithmetic Mean:**
**Disadvantages of Arithmetic Mean:**
---
**Definition:** The **median** is the positional value that divides the distribution into two equal parts. It is the middle value when data is arranged in ascending or descending order.
**Key Characteristic:** The median is based on position, not magnitude of values, making it **unaffected by extreme values**.
**Conceptual Importance:** While the mean represents the average, the median represents the typical middle observation. It shows what value exactly 50% of the observations fall below and 50% fall above.
---
**Steps:**
1. Arrange all observations in ascending order (from smallest to largest)
2. Find the middle position using: **Position = (N + 1) / 2**
3. The observation at this position is the median
**Case 1: Odd Number of Observations**
When N is odd, there is exactly one middle observation.
**Example 5: Simple Dataset**
Data: 5, 7, 6, 1, 8, 10, 12, 4, 3
**Step 1:** Arrange in ascending order: 1, 3, 4, 5, **6**, 7, 8, 10, 12
**Step 2:** Position = (9 + 1) / 2 = 5th position
**Step 3:** The 5th observation is **6**
**Median = 6**
Interpretation: Half the values (1, 3, 4, 5) are below 6, and half (7, 8, 10, 12) are above 6.
**Case 2: Even Number of Observations**
When N is even, there are two middle observations. The median is the average of these two middle values.
**Example 6: Even-Sized Dataset**
Data (20 students' marks): 25, 72, 28, 65, 29, 60, 30, 54, 32, 53, 33, 52, 35, 51, 42, 48, 45, 47, 46, 33
**Step 1:** Arrange in ascending order: 25, 28, 29, 30, 32, 33, 33, 35, 42, **45, 46**, 47, 48, 51, 52, 53, 54, 60, 65, 72
**Step 2:** Position = (20 + 1) / 2 = 10.5th position
This means the median lies between the 10th and 11th observations.
**Step 3:** The 10th value is 45 and 11th value is 46
**Median = (45 + 46) / 2 = 45.5 marks**
Interpretation: Exactly 50% of students scored ≤ 45.5 and 50% scored ≥ 45.5 marks.
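Both cases can be handled by one small function. The sketch below is illustrative and the name `median_value` is hypothetical; Python's standard library also provides `statistics.median`, which follows the same rule.

```python
def median_value(values):
    """Middle value of the sorted data; average of the two middle values if N is even."""
    if not values:
        raise ValueError("the median of an empty dataset is undefined")
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]  # the (N + 1) / 2 th item
    return (ordered[middle - 1] + ordered[middle]) / 2


example_5 = [5, 7, 6, 1, 8, 10, 12, 4, 3]
example_6 = [25, 72, 28, 65, 29, 60, 30, 54, 32, 53,
             33, 52, 35, 51, 42, 48, 45, 47, 46, 33]

print(median_value(example_5))  # 6
print(median_value(example_6))  # 45.5
```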
---
**Method:** Use cumulative frequency (cf) to locate the position (N+1)/2 and identify the corresponding value.
**Steps:**
1. Prepare cumulative frequency column
2. Calculate position = (N + 1) / 2
3. Find this position in the cumulative frequency column
4. The value corresponding to this cf is the median
**Example 7: Income Distribution**
| Income (Rs) | Frequency (f) | Cumulative Frequency (cf) |
|------------|--------------|--------------------------|
| 10 | 2 | 2 |
| 20 | 4 | 6 |
| 30 | 10 | 16 |
| 40 | 4 | 20 |
| **Total** | **N=20** | |
**Step 1:** Position of median = (20 + 1) / 2 = 10.5th item
**Step 2:** Look at cumulative frequency: 2, 6, **16**, 20
The 10.5th position falls in the cf of 16.
**Step 3:** The income corresponding to cf = 16 is **Rs 30**
**Median = Rs 30**
Interpretation: At least half of the persons earn Rs 30 or less, and at least half earn Rs 30 or more.
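The cumulative-frequency lookup in Example 7 might be sketched as follows. The function name `discrete_median` is hypothetical, and the sketch assumes the values are already listed in ascending order, as they are in the table.

```python
def discrete_median(values, freqs):
    """Return the first value whose cumulative frequency reaches (N + 1) / 2."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("total frequency must be positive")
    position = (n + 1) / 2
    cumulative = 0
    for x, f in zip(values, freqs):  # values assumed sorted ascending
        cumulative += f
        if cumulative >= position:
            return x
    raise ValueError("values and freqs do not match")


incomes = [10, 20, 30, 40]
persons = [2, 4, 10, 4]

print(discrete_median(incomes, persons))  # 30
```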
---
**Method:** Use the **N/2** position (not (N+1)/2) and apply the median formula.
**Steps:**
1. Prepare cumulative frequency table
2. Calculate N/2 position
3. Identify the **median class** (the class containing the N/2th observation)
4. Apply the median formula
**Median Formula:**
**Median = L + [(N/2 - cf) / f] × h**
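Where L = lower limit of the median class, cf = cumulative frequency of the class preceding the median class, f = frequency of the median class, h = width of the median class, and N = total number of observations.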
**Example 8: Daily Wages**
| Daily Wages (Rs) | Workers (f) | Cumulative Frequency (cf) |
|-----------------|------------|--------------------------|
| 20–25 | 14 | 14 |
| 25–30 | 28 | 42 |
| 30–35 | 33 | 75 |
| 35–40 | 30 | 105 |
| 40–45 | 20 | 125 |
| 45–50 | 15 | 140 |
| 50–55 | 13 | 153 |
| 55–60 | 7 | 160 |
| **Total** | **N=160** | |
**Step 1:** N/2 = 160/2 = 80
**Step 2:** Find where 80 falls in cf: The cf sequence is 14, 42, 75, **105**, 125, ...
The 80th observation falls in the **35–40 class**.
**Step 3:** For the median class 35–40: L = 35, cf = 75 (cumulative frequency of the preceding class), f = 30, h = 5
**Step 4:** Apply formula:
Median = 35 + [(80 - 75) / 30] × 5 = 35 + (5/30) × 5 = 35 + 0.83 = **Rs 35.83**
**Interpretation:** About 50% of workers earn Rs 35.83 or less, and about 50% earn Rs 35.83 or more. The figure is an estimate because it assumes wages are spread evenly within the median class.
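Example 8 can be verified with a short Python sketch of the median formula. Classes are represented as hypothetical `(lower, upper, frequency)` tuples, and the function name `grouped_median` is an assumption for illustration.

```python
def grouped_median(classes):
    """Median = L + ((N/2 - cf) / f) * h for a continuous series."""
    n = sum(f for _, _, f in classes)
    if n == 0:
        raise ValueError("total frequency must be positive")
    half = n / 2
    cf_before = 0  # cumulative frequency of the classes already passed
    for lower, upper, f in classes:
        if cf_before + f >= half:  # this is the median class
            return lower + (half - cf_before) / f * (upper - lower)
        cf_before += f


daily_wages = [
    (20, 25, 14), (25, 30, 28), (30, 35, 33), (35, 40, 30),
    (40, 45, 20), (45, 50, 15), (50, 55, 13), (55, 60, 7),
]

print(round(grouped_median(daily_wages), 2))  # 35.83
```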
---
**Activity Observation:**
| Series | Values | Mean | Median | Observation |
|--------|--------|------|--------|------------|
| A | 1, 2, 3 | 2 | 2 | Same |
| B | 1, 2, 30 | 11 | 2 | Mean shifted by outlier |
| C | 1, 2, 300 | 101 | 2 | Mean much higher |
| D | 1, 2, 3000 | 1001 | 2 | Mean extremely high |
**Key Finding:** The median remains **2** in all cases, while the mean increases dramatically as the largest value grows. This shows that the median is **resistant to extreme values**.
**Outliers:** Values that are extremely different from other observations in the dataset. They can distort the mean but cannot affect the median significantly.
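The four series in the activity table can be recomputed with Python's standard `statistics` module. The loop below is only an illustrative sketch; the variable names are arbitrary.

```python
from statistics import mean, median

rows = []
for largest in (3, 30, 300, 3000):
    series = [1, 2, largest]  # Series A to D from the activity table
    rows.append((largest, mean(series), median(series)))

for largest, avg, mid in rows:
    print(f"largest={largest:>4}  mean={avg:>6}  median={mid}")
```

The mean column grows with the largest value while the median column stays at 2.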
**When to Use Which:**
---
**Definition:** **Quartiles** are positional measures that divide the entire distribution into four equal parts, each containing 25% of the observations.
**Three Quartiles:**
**Q₁ (First Quartile/Lower Quartile):**
**Q₂ (Second Quartile/Median):**
**Q₃ (Third Quartile/Upper Quartile):**
**Interquartile Range (IQR):** The range between Q₁ and Q₃ contains the central 50% of the data. IQR = Q₃ - Q₁
---
**Formula for Ungrouped/Discrete Data:**
**Q₁ = Size of [(N + 1) / 4]th item**
**Q₃ = Size of [3(N + 1) / 4]th item**
**Method:**
1. Arrange data in ascending order
2. Apply the position formula
3. If position is a whole number, that observation is the quartile
4. If position is decimal (like 2.75), interpolate between the two surrounding observations
**Example 9: Lower Quartile Calculation**
Data (10 students' marks): 22, 26, 14, 30, 18, 11, 35, 41, 12, 32
**Step 1:** Arrange in ascending order: 11, 12, 14, 18, 22, 26, 30, 32, 35, 41
**Step 2:** Q₁ position = (10 + 1) / 4 = 11/4 = 2.75th item
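**Step 3:** The 2.75th position lies three-quarters of the way from the 2nd item (12) to the 3rd item (14): Q₁ = 12 + 0.75 × (14 - 12) = **13.5 marks**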
This means 25% of students scored 13.5 marks or less.
**Finding Q₃:**
Q₃ position = 3(10 + 1) / 4 = 33/4 = 8.25th item
Q₃ = 8th item + 0.25 × (9th item - 8th item) = 32 + 0.25 × (35 - 32) = **32.75 marks**
This means 75% of students scored 32.75 marks or less.
**Interpretation:** The central 50% of students scored between 13.5 and 32.75 marks.
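The position formulas and the interpolation step can be sketched in Python as below. The function names are hypothetical, and the sketch follows the (N + 1) convention used in these notes; other software may use a different quartile convention and give slightly different values.

```python
def positional_value(ordered, position):
    """Value at a 1-based, possibly fractional position, by linear interpolation."""
    whole = int(position)
    fraction = position - whole
    lower = ordered[whole - 1]
    if fraction == 0:
        return lower
    upper = ordered[whole]
    return lower + fraction * (upper - lower)


def quartiles(values):
    """Return (Q1, Q3) using positions (N + 1) / 4 and 3(N + 1) / 4."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 3:
        raise ValueError("need at least 3 observations for these positions")
    q1 = positional_value(ordered, (n + 1) / 4)
    q3 = positional_value(ordered, 3 * (n + 1) / 4)
    return q1, q3


marks = [22, 26, 14, 30, 18, 11, 35, 41, 12, 32]  # Example 9
q1, q3 = quartiles(marks)
print(q1, q3)   # 13.5 32.75
print(q3 - q1)  # 19.25 (interquartile range)
```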
---
**Definition:** **Percentiles** are positional measures that divide the distribution into 100 equal parts. There are 99 dividing positions: P₁, P₂, P₃, ..., P₉₉.
**Key Percentiles:**
**Interpretation:** If you score at the **82nd percentile** on an examination:
**Real-Life Application in India:**
---
**Definition:** The **mode** is the value that occurs with the highest frequency in a dataset. It is the most typical or most fashionable value around which maximum concentration of items occurs. Denoted by **Mo**.
**Derived from:** French word "la Mode" meaning the most fashionable value.
**Practical Applications:**
---
**Method:** Identify the observation with the highest frequency.
**Example 10: Size Frequency Distribution**
| Variable (Size) | 10 | 20 | 30 | 40 | 50 |
|-----------------|----|----|----|----|-----|
| Frequency | 2 | 8 | 20 | 10 | 5 |
**Highest frequency = 20**
**Corresponding value = 30**
**Mode = 30**
Interpretation: The size 30 is the most frequently occurring, so it has the highest demand.
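Finding the mode of a discrete series amounts to picking the value with the largest frequency. The sketch below uses a hypothetical function `modes` that returns a list, so that a tie between two or more values is reported rather than hidden.

```python
def modes(values, freqs):
    """Return every value that shares the highest frequency."""
    highest = max(freqs)
    return [x for x, f in zip(values, freqs) if f == highest]


sizes = [10, 20, 30, 40, 50]
frequencies = [2, 8, 20, 10, 5]

print(modes(sizes, frequencies))  # [30]
```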
---
**Unimodal Distribution:**
**Bimodal Distribution:**
**Multimodal Distribution:**
**No Mode:**
---
**Method:** Identify the **modal class** (class with highest frequency), then use the mode formula.
**Modal Class:** The class interval with the maximum frequency.
**Mode Formula:**
**Mode = L + [(f₁ - f₀) / (2f₁ - f₀ - f₂)] × h**
Or equivalently, writing D₁ = f₁ - f₀ and D₂ = f₁ - f₂:
**Mode = L + [D₁ / (D₁ + D₂)] × h**
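Where L = lower limit of the modal class, f₁ = frequency of the modal class, f₀ = frequency of the class preceding it, f₂ = frequency of the class following it, and h = width of the modal class.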
**Assumptions:**
---
1. Identify the class with the highest frequency → this is the modal class
2. Note down L, f₁, f₀, f₂, and h from the table
3. Substitute values in the mode formula
4. Calculate and obtain the mode value
**Example (Hypothetical):**
| Marks | 0–10 | 10–20 | 20–30 | 30–40 | 40–50 |
|-------|------|-------|-------|-------|-------|
| Frequency | 5 | 12 | **25** | 20 | 8 |
**Modal Class:** 20–30 (highest frequency = 25)
Mode = 20 + [(25 - 12) / (2×25 - 12 - 20)] × 10
Mode = 20 + [13 / (50 - 32)] × 10
Mode = 20 + [13 / 18] × 10
Mode = 20 + 7.22 = **27.22 marks**
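The steps above can be sketched in Python as follows. The `(lower, upper, frequency)` tuples and the function name `grouped_mode` are hypothetical. Treating a missing neighbouring class as having frequency 0 (when the modal class is the first or last one) is an assumption of this sketch, not something stated in the notes.

```python
def grouped_mode(classes):
    """Mode = L + ((f1 - f0) / (2*f1 - f0 - f2)) * h for a continuous series."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))  # first class with the highest frequency
    lower, upper, f1 = classes[i]
    f0 = freqs[i - 1] if i > 0 else 0               # assumption: 0 if no class before
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # assumption: 0 if no class after
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when 2*f1 = f0 + f2")
    return lower + (f1 - f0) / denominator * (upper - lower)


marks_classes = [(0, 10, 5), (10, 20, 12), (20, 30, 25), (30, 40, 20), (40, 50, 8)]

print(round(grouped_mode(marks_classes), 2))  # 27.22
```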
---
| Feature | Mean | Median | Mode |
|---------|------|--------|------|
| **Definition** | Sum of values ÷ number of values | Middle value | Highest frequency |
| **Position-based** | No | Yes | Yes |
| **Affected by Outliers** | Yes | No | No |
| **Unique Value** | Always | Always | May have multiple |
| **Calculation** | Uses all values | Uses position only | Uses frequency |
| **Best for** | Normal distribution | Skewed data | Categorical data |
| **Practical Use** | Average salary | Median house price | Popular product size |
---
**Advantages:**
**Disadvantages:**
---
**Use Arithmetic Mean when:**
**Use Median when:**
**Use Mode when:**
1. **Arithmetic Mean (ungrouped):** X̄ = ΣX / N
2. **Arithmetic Mean (grouped):** X̄ = ΣfX / Σf
3. **Assumed Mean Method:** X̄ = A + (Σd / N)
4. **Step Deviation Method:** X̄ = A + [(Σd' / N) × c]
5. **Weighted Mean:** X̄w = ΣWX / ΣW
6. **Median Position:** (N + 1) / 2 for ungrouped data and discrete series; N / 2 for continuous series
7. **Median Formula (continuous):** Median = L + [(N/2 - cf) / f] × h
8. **Q₁ Position:** (N + 1) / 4
9. **Q₃ Position:** 3(N + 1) / 4
10. **Mode Formula (continuous):** Mode = L + [(f₁ - f₀) / (2f₁ - f₀ - f₂)] × h
Q1. What is the Arithmetic Mean of the following marks: 40, 50, 55, 78, 58?
Answer: A — Sum = 40 + 50 + 55 + 78 + 58 = 281; Mean = 281 ÷ 5 = 56.2.
Q2. In the Assumed Mean method, if A = 850 and d = X – A = –150, what is the value of X?
Answer: A — Since d = X – A, then X = A + d = 850 + (–150) = 700.
Q3. Which method is most suitable when data contains very large numerical figures?
Answer: B — Assumed Mean and Step Deviation methods reduce computational complexity by working with smaller deviation values instead of large original figures.
Q4. For grouped discrete data with frequencies, the correct formula for Arithmetic Mean is:
Answer: C — In grouped data, each value X must be weighted by its frequency f; thus X̄ = ΣfX / Σf is the correct formula.
Q5. A housing colony has 200 plots of 100 sq.m, 50 plots of 200 sq.m, and 10 plots of 300 sq.m. What is the mean plot size?
Answer: A — Using ΣfX / Σf = (200×100 + 50×200 + 10×300) / (200 + 50 + 10) = 33,000 / 260 = 126.92 sq.m.
Q6. In the Step Deviation method with c = 10, if Σd' = 266 and N = 10, the adjustment to assumed mean is:
Answer: A — The adjustment = (Σd' / N) × c = (266 / 10) × 10 = 26.6 × 10 = 266, which is added to the assumed mean.
Q7. Which statement is CORRECT about central tendency?
Answer: C — Central tendency is designed to represent an entire dataset with one typical value; it works for both ungrouped and grouped data, but different averages suit different situations.
Q8. Assertion (A): The Assumed Mean method always gives a different result than the Direct method. Reason (R): Both methods calculate the same mean but use different calculation paths.
Answer: C — Both methods calculate the identical mean value; the Assumed Mean method simply provides a computational shortcut. The assertion is false because results are the same, not different.
Q9. If the assumed mean is set too far from the actual data center, which problem occurs in calculations?
Answer: B — A poor choice of assumed mean far from the data's center produces large deviations (d values), making arithmetic more cumbersome, though the final result remains correct if calculated properly.
Q10. In Baiju's village example, comparing his 1-acre holding to the average holding of 50 farmers helps determine which economic aspect?
Answer: B — The comparison of Baiju's holding to the average (central tendency) reveals whether he is above, below, or at the typical economic level of his village, showing his relative standing.
What is the formula for Arithmetic Mean in ungrouped data?
X̄ = ΣX / N, where ΣX is the sum of all observations and N is the total number of observations.
When should the Assumed Mean method be used instead of Direct method?
When the data contains a large number of observations or very large numerical figures that make direct calculation tedious.
What is a deviation (d) in the Assumed Mean method?
Deviation d = X – A, where X is an individual observation and A is the assumed mean.
How does the Step Deviation method differ from the Assumed Mean method?
Step Deviation divides all deviations by a common factor c to simplify calculations: d' = d / c, avoiding large numbers.
What is the formula for Arithmetic Mean in grouped discrete data?
X̄ = ΣfX / Σf, where f is frequency and X is the observation value.
Why is the central tendency called a 'representative value'?
Because one single number summarizes an entire dataset in a way that represents all observations collectively.
In the Baiju example, why do we need three averages (mean, median, mode) and not just one?
Because each average tells a different story—mean shows mathematical average, median shows middle position, and mode shows most common holding size.
What does the summation symbol Σ mean?
It means 'sum of' or 'add up all'—for example, ΣX means add all X values together.
In Example 2, why was 850 chosen as the assumed mean?
Because 850 is a centrally located value in the data, which simplifies calculations and reduces the size of deviations.
What is the key difference between ungrouped and grouped data in calculating mean?
Ungrouped data uses X values directly, while grouped data must multiply each X by its frequency f before summing.
Define Arithmetic Mean and write its formula for ungrouped data. Give one real-life example. [2 marks]
Define as: sum of all observations divided by number of observations. Write: X̄ = ΣX / N. Example: average marks in a test, average rainfall, average income, etc. with one sentence explanation.
Calculate the Arithmetic Mean using the Assumed Mean method for the following weekly income of 10 families (in Rs): 850, 700, 100, 750, 5000, 80, 420, 2500, 400, 360. Show all working steps. Why would the Assumed Mean method be preferred over the Direct method for this data? [5 marks]
Choose A = 850 (central value). Calculate d = X – A for each observation. Sum all deviations (Σd = 2,660). Apply X̄ = A + (Σd / N) = 850 + 2,660/10 = Rs 1,116. Explain: Large figures and many observations make Direct method tedious; Assumed Mean method simplifies by reducing numbers into smaller deviations.
In a housing colony, plot sizes are categorized as: 100 sq.m (200 plots), 200 sq.m (50 plots), and 300 sq.m (10 plots). (i) Calculate the mean plot size using the grouped data formula. (ii) Explain why frequency must be multiplied with each observation value. (iii) How would Baiju use this mean to assess if his 1-acre (4047 sq.m) plot is typical or unusual? Discuss the limitations of using only mean for this comparison. [6 marks]
Use formula X̄ = ΣfX / Σf. Calculate: (200×100 + 50×200 + 10×300) ÷ 260 = 126.92 sq.m. Explain frequency: Different counts of each size require weighted average, not simple average. For Baiju: 4047 sq.m >> 126.92 sq.m, so his plot is much larger—unusual. Limitations: Mean alone cannot show variability (range, spread); compare with median and mode too; single number hides distribution of holdings among farmers.