---
**Statistical thinking** is the way we think about data and make judgments based on information we observe from the world around us. When your teacher mentions two friends—one 5 feet tall and one 6 feet tall—and you guess genders based on height, you are using statistical thinking. You know that on average, men tend to be taller than women, but you also understand that exceptions exist (5-foot-tall men and 6-foot-tall women do occur, just rarely).
A **statistical statement** is a claim or summary about some phenomenon expressed in terms of numerical values, proportions, probabilities, or predictions.
**Examples of statistical statements:**
All these statements use numbers and data to describe real-world phenomena.
A **statistical question** is a question that can be answered by collecting and analyzing data. The key feature of a statistical question is that it expects answers that will vary.
**Characteristics of statistical questions:**
**Example:** "How tall are Grade 7 students in our school?" is a statistical question because:
**Another example:** "Typically, are onions costlier in Yahapur or Wahapur?" is a statistical question because:
**Examples identifying statistical questions:**
(a) **"What is the price of a tennis ball in India?"** — NOT statistical. Prices are fixed at a particular moment; they don't vary in the sense of data that needs collecting from multiple sources to describe a pattern.
(b) **"How old are the dogs that live on this street?"** — YES, statistical. Different dogs have different ages; we need to collect data and analyze.
(c) **"What fraction of the students in your class like walking up a hill?"** — YES, statistical. Requires surveying students and analyzing responses that will vary.
(d) **"Do you like reading?"** — NOT statistical. This is a simple yes/no question about one person's preference, not about variability in a group.
(e) **"Approximately how many bricks are in this wall?"** — YES, statistical. Estimation based on collecting data about brick sizes and wall dimensions.
(f) **"Who was the best bowler in the match yesterday?"** — NOT statistical. This asks for a specific fact about one event, not data showing variability.
(g) **"What was the rainfall pattern in Barmer last year?"** — YES, statistical. Rainfall varies across days and months; we need data and analysis to describe the pattern.
**Statistics** is the study of collecting, organizing, analyzing, interpreting, and presenting data.
The five main components:
1. **Collecting data** — gathering information systematically
2. **Organizing data** — arranging it in tables, lists, or other forms
3. **Analyzing data** — finding patterns, calculating representative values
4. **Interpreting data** — drawing conclusions and making sense of findings
5. **Presenting data** — showing results through tables, graphs, or descriptions
---
When we have a collection of numbers, comparing them one-by-one can be confusing. For example, consider two cricket players' performance over 4 matches:
**Shubman:** 0, 17, 21, 90 runs
**Yashasvi:** 67, 55, 18, 35 runs
Different ways to compare:
**The challenge:** When comparing players with different numbers of matches played, the total runs scored may not be fair. For example:
Simply comparing totals (110 > 96) is unfair because Shubman played one more match. We need a **representative value** that accounts for different group sizes.
**Definition:** The **Arithmetic Mean** (or simply **Mean** or **Average**) is a single number that represents all values in a group of data. It is calculated by adding all values and dividing by the count of values.
**Formula:**
$$\text{Mean} = \frac{\text{Sum of all values in the data}}{\text{Number of values in the data}}$$
Or written as:
$$\text{Mean} = \frac{\text{Total}}{\text{Count}}$$
**Calculation for the cricket series:**
For Shubman (5 matches):
$$\text{Mean} = \frac{23 + 7 + 10 + 52 + 18}{5} = \frac{110}{5} = 22 \text{ runs per match}$$
For Yashasvi (4 matches):
$$\text{Mean} = \frac{26 + 53 + 2 + 15}{4} = \frac{96}{4} = 24 \text{ runs per match}$$
**Conclusion:** Although Shubman scored 110 total runs vs. Yashasvi's 96, Yashasvi's average (24 runs/match) is higher than Shubman's average (22 runs/match). This shows Yashasvi performed better on average.
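The comparison above can be reproduced with a few lines of Python. This is a minimal sketch: the helper name `mean` is introduced here for illustration, and the run lists are copied from the worked calculation.

```python
# Runs per match, copied from the worked calculation above.
shubman = [23, 7, 10, 52, 18]   # 5 matches
yashasvi = [26, 53, 2, 15]      # 4 matches


def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


for name, runs in [("Shubman", shubman), ("Yashasvi", yashasvi)]:
    print(f"{name}: total = {sum(runs)}, matches = {len(runs)}, mean = {mean(runs)}")
# Shubman: total = 110, matches = 5, mean = 22.0
# Yashasvi: total = 96, matches = 4, mean = 24.0
```

The totals favour Shubman, while the means favour Yashasvi, because dividing by the number of matches removes the advantage of having played more often.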
The average can also be understood as **fair-share** or **equal-share**. If items are divided equally among people, each person gets the average amount.
**Example:** Fruit Distribution
**Shreyas's group** collected guavas: 3, 8, 10, 5, 4
**Parag's group** collected guavas: 5, 4, 6, 3, 4, 8
Even though both groups collected 30 guavas, each person in Shreyas's group gets 1 more guava (6 vs. 5) because there are fewer people to share with.
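One way to see the fair-share idea in action is to simulate the redistribution: keep moving one guava from whoever has the most to whoever has the least until everyone is level. The function name `fair_share` is an assumption made for this sketch, not something from the chapter.

```python
def fair_share(counts):
    """Move one item at a time from the largest pile to the smallest
    until no two piles differ by more than one item."""
    piles = list(counts)
    moves = 0
    while max(piles) - min(piles) > 1:
        piles[piles.index(max(piles))] -= 1   # take one from the biggest pile
        piles[piles.index(min(piles))] += 1   # give it to the smallest pile
        moves += 1
    return piles, moves


shreyas_group = [3, 8, 10, 5, 4]
parag_group = [5, 4, 6, 3, 4, 8]

print(fair_share(shreyas_group)[0])  # [6, 6, 6, 6, 6]
print(fair_share(parag_group)[0])    # [5, 5, 5, 5, 5, 5]
```

Because both totals divide evenly by the group size, every pile ends at exactly the mean. When the total does not divide evenly, the loop stops with piles that differ by one.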
**Visual representation of fair-share:**
If we redistribute the guavas equally:
**Problem:** Vaishnavi tracks the number of Hibiscus flowers blooming in her garden each day. The data for the last few days is 2, 7, 9, 4, 3. What is the average number of flowers blooming per day?
**Solution:**
$$\text{Average} = \frac{\text{Total flowers}}{\text{Number of days}} = \frac{2 + 7 + 9 + 4 + 3}{5} = \frac{25}{5} = 5 \text{ flowers per day}$$
**Interpretation:** On average, 5 Hibiscus flowers bloom daily. This means if the same number of flowers bloomed each day, there would be 5 per day to get the same total.
The Arithmetic Mean was used and valued in ancient Indian mathematics with special terminology:
The terminology shows ancient Indian scholars understood the Arithmetic Mean as the **common value** or **equalising value** that represents a collection of values.
**Problem 1:** Ball Bouncing
Shreyas bounces a ball on a bat 8 times: 6, 2, 9, 5, 4, 6, 3, 5 bounces per attempt.
**Solution:**
$$\text{Mean} = \frac{6 + 2 + 9 + 5 + 4 + 6 + 3 + 5}{8} = \frac{40}{8} = 5 \text{ bounces per attempt}$$
**Problem 2:** Runner Comparison
Two friends, Nikhil and Sunil, are training for a 100 m race. Their times over a week (in seconds) are used in the solutions below.
**Solution for Nikhil:**
$$\text{Mean} = \frac{17 + 18 + 17 + 16 + 19 + 17 + 18}{7} = \frac{122}{7} ≈ 17.43 \text{ seconds}$$
**Solution for Sunil:**
$$\text{Mean} = \frac{20 + 18 + 18 + 17 + 16 + 16 + 17}{7} = \frac{122}{7} ≈ 17.43 \text{ seconds}$$
Both have the same average time! But looking at individual times, Nikhil's times are more consistent (closer to average), while Sunil's include a 20-second attempt that's slower.
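A short Python sketch makes the point concrete: the two means are identical, but the gap between each runner's fastest and slowest time is not. The helper name `summarize` is hypothetical, and the spread is measured as maximum minus minimum.

```python
# 100 m times in seconds, copied from the two solutions above.
nikhil = [17, 18, 17, 16, 19, 17, 18]
sunil = [20, 18, 18, 17, 16, 16, 17]


def summarize(times):
    """Return (mean rounded to 2 decimals, slowest time minus fastest time)."""
    mean = sum(times) / len(times)
    spread = max(times) - min(times)
    return round(mean, 2), spread


print("Nikhil:", summarize(nikhil))  # Nikhil: (17.43, 3)
print("Sunil:", summarize(sunil))    # Sunil: (17.43, 4)
```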
**Problem 3:** School Enrolment
Enrolment over 6 consecutive years: 1555, 1670, 1750, 2013, 2040, 2126
**Solution:**
$$\text{Mean} = \frac{1555 + 1670 + 1750 + 2013 + 2040 + 2126}{6} = \frac{11154}{6} = 1859$$
The mean enrolment is 1859 students.
---
A real-world example of comparing two locations (Yahapur and Wahapur) based on monthly onion prices:
| Month | Yahapur | Month | Wahapur |
|-------|---------|-------|---------|
| January | 25 | January | 19 |
| February | 24 | February | 17 |
| March | 26 | March | 23 |
| April | 28 | April | 30 |
| May | 30 | May | 38 |
| June | 35 | June | 35 |
| July | 39 | July | 42 |
| August | 43 | August | 39 |
| September | 49 | September | 53 |
| October | 56 | October | 60 |
| November | 59 | November | 52 |
| December | 44 | December | 42 |
**Question:** Where are onions costlier?
Different students analyzed the data differently:
**Khushboo's analysis:** "I think Wahapur is costlier because it has the highest price of ₹60."
**Nafisa's analysis:** "I added the prices of all months in each location - Yahapur's total is 458, whereas Wahapur's total is 450."
**Vishal's analysis:** "Wahapur is costlier since it has 3 numbers in the 50s."
**Sampat's analysis:** "I compared prices in each month in both locations. Prices in Yahapur are higher for 6 months, prices in Wahapur are higher for 5 months, and the prices are the same for 1 month. So, Yahapur is costlier."
**Jithin's analysis:** "I noticed that the difference between the highest and lowest prices in Yahapur is 59 – 24 = 35, and in Wahapur it is 60 – 17 = 43."
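Sampat's month-by-month comparison is the easiest of these analyses to get wrong by hand, so it is worth checking. The sketch below pairs up the two price lists from the table; the variable names are assumptions made for illustration.

```python
# Monthly onion prices, January to December, copied from the table above.
yahapur = [25, 24, 26, 28, 30, 35, 39, 43, 49, 56, 59, 44]
wahapur = [19, 17, 23, 30, 38, 35, 42, 39, 53, 60, 52, 42]

# zip() pairs the January prices, then the February prices, and so on.
pairs = list(zip(yahapur, wahapur))

yahapur_higher = sum(y > w for y, w in pairs)
wahapur_higher = sum(w > y for y, w in pairs)
same_price = sum(y == w for y, w in pairs)

print(yahapur_higher, wahapur_higher, same_price)  # 6 5 1
```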
Data can be described and compared using:
1. **Minimum value** — the lowest price
2. **Maximum value** — the highest price
3. **Average (Mean) value** — the central tendency
4. **Sum total** — the total of all values (useful when comparing equal groups)
5. **Range** — difference between max and min, showing variability
**Calculation of averages for onion prices:**
For **Yahapur:**
$$\text{Mean} = \frac{458}{12} ≈ 38.17 \text{ rupees per kg}$$
For **Wahapur:**
$$\text{Mean} = \frac{450}{12} = 37.5 \text{ rupees per kg}$$
**Conclusion using mean:** Yahapur's average price (₹38.17) is slightly higher than Wahapur's (₹37.5), confirming Yahapur is costlier on average.
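The five descriptors (minimum, maximum, mean, sum total, and range) can be computed together. The following is a minimal sketch; the function name `describe` and the dictionary keys are chosen here for illustration.

```python
# Monthly onion prices, January to December, copied from the table above.
yahapur = [25, 24, 26, 28, 30, 35, 39, 43, 49, 56, 59, 44]
wahapur = [19, 17, 23, 30, 38, 35, 42, 39, 53, 60, 52, 42]


def describe(prices):
    """Summarise a list of prices with the five descriptors from the text."""
    return {
        "minimum": min(prices),
        "maximum": max(prices),
        "mean": round(sum(prices) / len(prices), 2),
        "total": sum(prices),
        "range": max(prices) - min(prices),
    }


print("Yahapur:", describe(yahapur))
print("Wahapur:", describe(wahapur))
```

Yahapur comes out higher on mean and total, while Wahapur has the higher maximum and the larger range, which is why the students' answers disagree.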
A **dot plot** is a way to visualize data by placing dots on a number line. Each dot represents one occurrence of a value.
**Features of dot plots:**
**Reading the dot plot for onion prices:**
[Figure: dot plots of the monthly onion prices on a number line from 10 to 60, with Yahapur shown as green dots and Wahapur as purple dots.]
**Advantages of dot plots:**
**Limitation of dot plots:**
**Observation from the dot plot:**
This visualization helps understand that while Yahapur is slightly costlier on average, Wahapur has more volatile pricing.
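A dot plot can be approximated in plain text by printing one row per distinct price and one dot per occurrence. This is only a rough sketch (a real dot plot spaces values along a scaled number line), and the function name `dot_plot` is an assumption.

```python
from collections import Counter

# Monthly onion prices, copied from the table above.
yahapur = [25, 24, 26, 28, 30, 35, 39, 43, 49, 56, 59, 44]
wahapur = [19, 17, 23, 30, 38, 35, 42, 39, 53, 60, 52, 42]


def dot_plot(values):
    """Return one text row per distinct value, with one dot per occurrence."""
    counts = Counter(values)
    return [f"{value:>3} | {'●' * counts[value]}" for value in sorted(counts)]


print("Wahapur")
print("\n".join(dot_plot(wahapur)))
```

The price 42 appears twice in Wahapur, so its row carries two dots. The month each price belongs to is no longer visible, which is the information a dot plot gives up compared with the table.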
---
Looking at variations in data can spark curiosity. With the onion price data, one might wonder:
Observing and trying to make sense of data can reveal interesting patterns and trigger curiosity in different directions. This is what makes statistics valuable beyond just calculating numbers.
---
The Arithmetic Mean is frequently used in different fields:
**Agriculture:**
**Entertainment:**
**Transportation:**
**Environmental:**
**Weather:**
**Technology:**
1. **Simple definition** — easy to understand and explain
2. **Easy to calculate** — just add and divide
3. **Works with any kind of numerical data**
4. **Useful for comparisons** — between groups or over time
5. **Has mathematical properties** that make it useful for further analysis
---
Sometimes the average doesn't give a fair picture of data, especially when there are **outliers**.
**Yaangba's family heights (cm):** 169, 173, 155, 165, 160, 164
**Poovizhi's family heights (cm):** 170, 173, 165, 118, 175
**Question:** Which family is taller?
**Calculating means:**
For Yaangba:
$$\text{Mean} = \frac{169 + 173 + 155 + 165 + 160 + 164}{6} = \frac{986}{6} ≈ 164.3 \text{ cm}$$
For Poovizhi:
$$\text{Mean} = \frac{170 + 173 + 165 + 118 + 175}{5} = \frac{801}{5} = 160.2 \text{ cm}$$
**Conclusion from means:** Yaangba's family (164.3 cm) is taller on average.
**But is this fair?** Looking at the actual heights:
The problem: Poovizhi's family has one very young child who is 118 cm tall. This value significantly pulls down the average.
**An outlier** is a value that significantly deviates from the rest of the values in the data. It is an unusual or extreme value that doesn't fit the general pattern.
**In Poovizhi's family:**
**The median** is the middle value when data is arranged in order.
**Steps to find the median:**
1. Arrange all values in order (from smallest to largest)
2. If odd number of values: pick the middle one
3. If even number of values: find the average of the two middle values
**Finding median height of Poovizhi's family:**
1. Heights in order: 118, 165, 170, 173, 175 (5 values — odd count)
2. Middle position: 3rd value (with 2 values below and 2 above)
3. **Median = 170 cm**
**Finding median height of Yaangba's family:**
1. Heights in order: 155, 160, 164, 165, 169, 173 (6 values — even count)
2. Two middle positions: 3rd value (164) and 4th value (165)
3. **Median = (164 + 165) ÷ 2 = 164.5 cm**
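The three steps for finding the median translate directly into code. The sketch below writes the function by hand so each step stays visible; Python's standard library also offers `statistics.median`, which gives the same results for these lists.

```python
def median(values):
    """Middle value of the sorted data, or the average of the two middle values."""
    ordered = sorted(values)              # Step 1: arrange in order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                        # Step 2: odd count, take the middle one
        return ordered[middle]
    # Step 3: even count, average the two middle values
    return (ordered[middle - 1] + ordered[middle]) / 2


poovizhi = [170, 173, 165, 118, 175]
yaangba = [169, 173, 155, 165, 160, 164]

print(median(poovizhi))  # 170
print(median(yaangba))   # 164.5
```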
**For Yaangba's family (no outlier):**
**For Poovizhi's family (with outlier):**
**Important observation:** When outliers are present, the mean is affected much more than the median.
If we remove the 118 cm value from Poovizhi's family:
**Compare to with outlier:**
The mean changed by 10.55 cm, but the median only changed by 1.5 cm. The median is more robust to outliers!
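The size of each shift can be checked with Python's standard `statistics` module. In this sketch the list `without_outlier` is built by dropping the 118 cm value, as the text describes; the variable names are assumptions.

```python
import statistics

poovizhi = [170, 173, 165, 118, 175]
without_outlier = [h for h in poovizhi if h != 118]

mean_shift = statistics.mean(without_outlier) - statistics.mean(poovizhi)
median_shift = statistics.median(without_outlier) - statistics.median(poovizhi)

print(f"mean shift:   {mean_shift:.2f} cm")    # mean shift:   10.55 cm
print(f"median shift: {median_shift:.2f} cm")  # median shift: 1.50 cm
```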
**Problem:** After summer vacation, a class teacher asked students how many short stories they had read. The data collected was:
2, 5, 4, 6, 5, 3, 7, 6, 5, 4, 40, 6, 5, 4
**Find mean and median. Can you identify an outlier?**
**Solution:**
**Mean:**
$$\text{Mean} = \frac{2 + 5 + 4 + 6 + 5 + 3 + 7 + 6 + 5 + 4 + 40 + 6 + 5 + 4}{14}$$
$$= \frac{102}{14} ≈ 7.3 \text{ short stories}$$
**Median:** Arrange in order: 2, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 40
14 values (even), so median is average of 7th and 8th values:
$$\text{Median} = \frac{5 + 5}{2} = 5 \text{ short stories}$$
**Identifying the outlier:** The value 40 is much higher than all others (which are between 2-7). This is clearly an outlier at the higher end.
**Interpretation:** The median value 5 means that half the class read 5 or more stories, which is more representative. The outlier (40 stories) pulls the mean up to 7.3, making it seem like the class read more than they typically did.
**Without the outlier (removing 40):**
Notice: The mean dropped significantly from 7.3 to 4.77, but the median stayed at 5!
**Problem:** A newspaper's page count from Monday to Sunday: 16, 18, 20, 22, 26, 16, 10
**Find mean and median. Identify any outliers.**
**Solution:**
**Mean:**
$$\text{Mean} = \frac{16 + 18 + 20 + 22 + 26 + 16 + 10}{7} = \frac{128}{7} ≈ 18.3 \text{ pages}$$
**Median:** Arrange in order: 10, 16, 16, 18, 20, 22, 26 (7 values — odd)
Middle value is 4th: **Median = 18 pages**
**Observations:**
---
**Measures of central tendency** are values that represent the center or middle of a distribution. The name reflects the tendency of values to pile up around a particular value.
**The two main measures:**
1. **Mean (Average)** — sum of all values divided by count
2. **Median** — middle value when arranged in order
**When to use each:**
Beyond central tendency, we can measure how spread out or variable data is.
**Range** = Maximum value - Minimum value
This tells us about the spread or dispersion of data.
For onion prices:
Wahapur has greater variability in prices.
---
When analyzing data, important aspects include:
1. **The extremes** — minimum and maximum values
2. **Central tendency** — mean and median
3. **Variability** — how spread out the data is
**Scenario:** Grade 5 class heights in centimeters
**Boys' heights:** 147, 135, 130, 154, 128, 135, 134, 158, 155, 146, 146, 142, 140, 141, 144, 145, 150 (17 students)
**Girls' heights:** 143, 136, 150, 144, 154, 140, 145, 148, 156, 150, 150 (11 students)
**Total class:** 28 students
**Calculating measures:**
**For the whole class:**
**For boys only:**
**For girls only:**
1. **Girls are taller on average:** Mean for girls (146.9) > Mean for boys (142.94)
2. **Boys have lower median:** 144 cm vs. 148 cm for girls
3. **Whole class mean (144.5) lies between the boys' and girls' means**
4. **Boys form a larger group** (17 vs. 11), so their values have more influence on class average
5. **Variability matters:** Looking at just the mean doesn't tell the complete story about the class's heights
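The class figures can be recomputed from the two height lists. The sketch below also shows one way to see why the larger group has more influence: the whole-class mean equals the two group means weighted by group size. Variable names are assumptions made for illustration.

```python
import statistics

# Heights in cm, copied from the scenario above.
boys = [147, 135, 130, 154, 128, 135, 134, 158, 155,
        146, 146, 142, 140, 141, 144, 145, 150]
girls = [143, 136, 150, 144, 154, 140, 145, 148, 156, 150, 150]
whole_class = boys + girls

for label, heights in [("Boys", boys), ("Girls", girls), ("Class", whole_class)]:
    print(f"{label}: n = {len(heights)}, "
          f"mean = {statistics.mean(heights):.2f}, "
          f"median = {statistics.median(heights)}")

# Weight each group mean by its group size, then divide by the class size.
weighted_mean = (len(boys) * statistics.mean(boys)
                 + len(girls) * statistics.mean(girls)) / len(whole_class)
print(f"Weighted mean of the two groups: {weighted_mean:.2f}")
```

With 17 boys and 11 girls, the class mean sits closer to the boys' mean than to the girls' mean.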
Using dot plots, we can see:
---
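**Mean:**
$$\text{Mean} = \frac{\text{Sum of all values}}{\text{Number of values}}$$
**Median (odd number of values):** Arrange in order and pick the middle value, at position (n + 1)/2, where n is the count.
**Median (even number of values):**
$$\text{Median} = \frac{\left(\frac{n}{2}\right)\text{th value} + \left(\frac{n}{2} + 1\right)\text{th value}}{2}$$
**Range:**
$$\text{Range} = \text{Maximum value} - \text{Minimum value}$$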
---
**Statistical statement:** A claim or summary about a phenomenon using numerical values, proportions, probabilities, or predictions.
**Statistical question:** A question answerable by collecting data, expecting varied answers, requiring analysis.
**Statistics:** The study of collecting, organizing, analyzing, interpreting, and presenting data.
**Mean/Average:** The sum of all values divided by the count of values; represents fair-share.
**Median:** The middle value when data is arranged in order; much less affected by outliers than the mean.
**Outlier:** A value that significantly deviates from other values in the dataset.
**Measure of central tendency:** A value representing the center/middle of data distribution (mean or median).
Q1. Which of the following is a statistical question?
Answer: B — A statistical question requires data collection and analysis because answers vary; option B is about typical spending across many students, while A, C, and D have single or fixed answers.
Q2. The average price of mangoes at a market over 5 days is ₹40 per kg. What is the total cost?
Answer: D — Average tells us the mean price per day, but without knowing the quantity bought each day, we cannot calculate total cost.
Q3. Shreyas scored 10, 20, 30, 40, and 50 marks in five tests. What is his average score?
Answer: A — Sum = 10 + 20 + 30 + 40 + 50 = 150; Average = 150 ÷ 5 = 30.
Q4. A farmer collected guavas: Day 1 = 5, Day 2 = 8, Day 3 = 7. If he wants to sell equal bundles each day, how many guavas per bundle?
Answer: B — Total guavas = 5 + 8 + 7 = 20; Days = 3; Average = 20 ÷ 3 ≈ 6.67. Since a bundle must hold a whole number of guavas, this means 6 or 7 per bundle, depending on how the remainder is handled.
Q5. Two batsmen scored runs as follows — Batter A: 15, 25, 35 in three matches; Batter B: 20, 20, 20 in three matches. Who has a more consistent batting record?
Answer: B — Batter B's scores vary less (range = 0), showing consistency, while Batter A's scores vary from 15 to 35 (range = 20).
Q6. Onion prices over 4 months: ₹20, ₹25, ₹30, ₹45. The average price is ₹30. If the store says 'typical monthly price is ₹30', what does this mean?
Answer: B — The fair-share interpretation of average shows that if total (₹120) were divided equally over 4 months, each would get ₹30.
Q7. A dot plot shows onion prices in two towns. Wahapur's dots are spread across 17–60 rupees, while Yahapur's are between 24–59 rupees. What can we infer?
Answer: A — Yahapur's range (59 − 24 = 35) is smaller than Wahapur's range (60 − 17 = 43), meaning Yahapur has less price variation.
Q8. Nikhil ran 100 m in: 17, 18, 17, 16, 19, 17, 18 seconds. What is his average running time?
Answer: B — Sum = 17 + 18 + 17 + 16 + 19 + 17 + 18 = 122; Count = 7; Average = 122 ÷ 7 ≈ 17.43 ≈ 17.4 seconds.
Q9. Two groups of students collected money for charity. Group 1 collected ₹150 from 5 students; Group 2 collected ₹180 from 6 students. Which group's average contribution per student is higher?
Answer: C — Group 1: 150 ÷ 5 = ₹30 per student; Group 2: 180 ÷ 6 = ₹30 per student — both averages are equal.
Q10. Why is a dot plot more useful than a raw table for spotting which price range onions fall into most frequently?
Answer: B — Dot plots arrange data on a number line, making it easy to see where prices cluster and how they spread, unlike a table which lists values sequentially.
What is a statistical question?
A question that requires collecting and analysing data because answers vary, like 'How tall are Grade 7 students in our school?'
Define the arithmetic mean (average).
The sum of all values in the data divided by the number of values.
Why is average better than total when comparing two groups of different sizes?
Because total depends on group size, but average shows performance per unit, giving a fair comparison.
What does a dot plot show?
Data points as dots on a horizontal line, revealing clusters, spread, and frequency of values.
What is the fair-share interpretation of average?
If we redistribute all values equally among all people, each person gets the average amount.
Calculate the average: 6, 2, 9, 5, 4, 6, 3, 5.
Sum = 40, Count = 8, Average = 40 ÷ 8 = 5.
What information does a dot plot lose compared to a table?
A dot plot loses the original order or sequence of data (like month-wise prices).
Name three ways to describe and compare data.
Using minimum value, maximum value, average value, sum total, or range (difference between max and min).
What is the range of data, and how is it calculated?
Range is the difference between the highest and lowest values: Range = Maximum − Minimum.
Why did ancient Indian mathematicians use the word 'sama' in arithmetic mean?
Because 'sama' means equal, showing that the mean represents the common or equalising value of a collection.
What is a statistical question? Give one example. [1 mark]
A statistical question requires data collection because answers vary; example should show variability (e.g., 'How many hours do Class 7 students study daily?').
Vaishnavi tracked hibiscus flowers blooming daily: 2, 7, 9, 4, 3. Calculate the average number of flowers blooming per day. What does this average tell us? [2 marks]
Step 1: Add all values (2 + 7 + 9 + 4 + 3 = 25). Step 2: Divide by number of days (25 ÷ 5 = 5). Interpretation: If equal flowers bloomed each day, it would be 5 flowers (fair-share idea).
Two runners trained for a 100 m race. Nikhil's times (in seconds): 17, 18, 17, 16, 19, 17, 18. Sunil's times: 20, 18, 18, 17, 16, 16, 17. Calculate the average time for each and state who ran quicker on average. Show all steps. [3 marks]
Step 1: Find Nikhil's sum (122) and divide by 7 to get average ≈ 17.4 seconds. Step 2: Find Sunil's sum (122) and divide by 7 to get average ≈ 17.4 seconds. Step 3: Compare and conclude that the two averages are equal (both are 122 ÷ 7), so neither runner was quicker on average.
The monthly onion prices (in ₹/kg) at two towns are given: Yahapur: 25, 24, 26, 28, 30, 35, 39, 43, 49, 56, 59, 44. Wahapur: 19, 17, 23, 30, 38, 35, 42, 39, 53, 60, 52, 42. (a) Calculate the average price at each town. (b) Find the range (max − min) for each town. (c) Which town has more price stability? Justify your answer using at least two measures of comparison. [5 marks]
Part (a): Yahapur sum = 458, average = 458 ÷ 12 ≈ 38.17; Wahapur sum = 450, average = 450 ÷ 12 = 37.5. Part (b): Yahapur range = 59 − 24 = 35; Wahapur range = 60 − 17 = 43. Part (c): Compare using averages (close, but Yahapur slightly higher) and ranges (Yahapur smaller = more stable). Dot plot visualization can also support the conclusion about clustering or spread.